Coding agent

Coding agents write code, run it, read errors, fix, iterate. The test suite is the verifier; the error trace is the feedback. Claude Code, Cursor, and tools like Devin are built on this pattern.

The loop

  1. Read the task
  2. Explore the codebase (read files, search)
  3. Plan changes
  4. Write code
  5. Run tests
  6. Read errors if any
  7. Fix and iterate until passing
  8. Submit changes
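
The write, test, and fix steps of the loop can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product implements it: `agent_loop`, `RunResult`, `fake_write_code`, and `fake_run_tests` are hypothetical names, and the model and the test suite are replaced by small stand-in functions so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    passed: bool
    errors: str  # error trace to feed back to the model; empty on success


def agent_loop(
    task: str,
    write_code: Callable[[str, str], str],
    run_tests: Callable[[str], RunResult],
    max_iterations: int = 5,
) -> Tuple[Optional[str], int]:
    """Return (passing_code, attempts), or (None, max_iterations) on failure."""
    feedback = ""
    for attempt in range(1, max_iterations + 1):
        code = write_code(task, feedback)   # step 4: write (or fix) code
        result = run_tests(code)            # step 5: run tests
        if result.passed:
            return code, attempt            # step 8: submit
        feedback = result.errors            # steps 6-7: read errors, iterate
    return None, max_iterations


# Stand-in for the model: the first draft has a bug, the second fixes it.
def fake_write_code(task: str, feedback: str) -> str:
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


# Stand-in for the test suite: one check against the candidate code.
def fake_run_tests(code: str) -> RunResult:
    namespace: dict = {}
    exec(code, namespace)
    got = namespace["add"](2, 3)
    if got == 5:
        return RunResult(True, "")
    return RunResult(False, f"AssertionError: add(2, 3) returned {got}, expected 5")


code, attempts = agent_loop("implement add", fake_write_code, fake_run_tests)
print(attempts)
```

The iteration cap is a design choice in this sketch: without it, an agent that never produces passing code would loop forever.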

Core tools

Why this works

Code has a natural verifier: compilers, test suites, linters. When the agent writes bad code, the feedback is unambiguous. The LLM uses the error message to plan the fix. Few domains have a feedback loop this tight, which is why coding agents are ahead of other verticals.
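
To make the verifier idea concrete, here is a small sketch that runs a snippet in a separate Python process and captures the exit code and the error trace. The helper name `verify` and the sample snippet are assumptions for illustration; a real agent would more likely invoke the project's own test command.

```python
import subprocess
import sys
from typing import Tuple


def verify(source: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run `source` in a fresh interpreter; return (passed, stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Exit code 0 means the snippet ran without an uncaught exception.
    return proc.returncode == 0, proc.stderr


ok, trace = verify("assert sum([1, 2]) == 4, 'sum is wrong'")
print(ok)
print(trace)
```

The pass/fail signal is a single boolean, and the trace is plain text that can be handed back to the model as feedback.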

Context management

Codebases are too big to fit in context. The agent must:

Testing matters

Agent quality tracks test suite quality. Without tests, the agent has no feedback. Encourage users to have good tests in place before deploying a coding agent.

Safety

Coding agents can do damage. Guardrails:

Tools in the ecosystem