Procedural memory

Updated 2026-04-19

Procedural memory is the agent remembering how to do things. Not facts, not past events. Skills. "Here's the recipe that worked for this kind of task." "Here are the defaults that made it succeed." An agent with procedural memory gets better at recurring tasks over time because it doesn't reason from scratch every run. It starts from a known-good recipe and adapts.

What a procedure is

A procedure is a named, reusable recipe for a class of task. It includes the steps, the default parameters, the constraints, and the signals that a given situation matches it.

{
  "procedure_id": "deploy_staging",
  "description": "Deploy a branch to staging",
  "when_to_use": "User says 'deploy to staging' or 'push this to staging'",
  "steps": [
    "Confirm branch is up to date with main",
    "Run tests locally",
    "Push to remote",
    "Trigger staging deploy workflow",
    "Verify at staging.example.com"
  ],
  "defaults": {"timeout_minutes": 10},
  "constraints": "Don't deploy on Fridays after 3pm ET"
}
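One way to hold a record like this in code is a small typed container with a loader that rejects incomplete procedures. This is a minimal sketch, not a fixed schema: the `Procedure` class, the `load_procedure` helper, and the choice of which fields are required are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Procedure:
    """Hypothetical in-memory form of the JSON record shown above."""
    procedure_id: str
    description: str
    when_to_use: str
    steps: list[str]
    defaults: dict = field(default_factory=dict)
    constraints: str = ""


# Assumption: these four fields are the minimum a usable procedure needs.
REQUIRED_FIELDS = ("procedure_id", "description", "when_to_use", "steps")


def load_procedure(raw: str) -> Procedure:
    """Parse a JSON string into a Procedure, rejecting incomplete records."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"procedure is missing fields: {missing}")
    if not data["steps"]:
        raise ValueError("procedure needs at least one step")
    # Unknown keys raise TypeError here, which keeps typos from slipping in.
    return Procedure(**data)
```

Rejecting records without `when_to_use` at load time is a design choice: a procedure with no match signal is the one an agent is most likely to apply in the wrong situation.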

How procedures get built

The first time you ask the agent to do a task, it explores: tries things, hits dead ends, eventually succeeds. Post-task, you (or the agent) distill that successful trajectory into a procedure. Next time, the agent retrieves the procedure and follows it (or adapts when the situation is different).
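The distillation step can be sketched as a filter over the logged trajectory: keep the actions that worked, drop the dead ends, and refuse to write anything when the task as a whole failed. This is a deliberately simple rule-based stand-in, not an LLM summarizer; the `Attempt` record, the `distill` function, and the `"draft"` status are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One logged action from an exploratory run (hypothetical shape)."""
    action: str
    succeeded: bool


def distill(procedure_id, description, when_to_use, trajectory, task_succeeded):
    """Turn a successful trajectory into a draft procedure.

    Dead ends (failed attempts) are dropped; the order of the
    successful actions is preserved.
    """
    if not task_succeeded:
        raise ValueError("only distill trajectories from successful runs")
    steps = [attempt.action for attempt in trajectory if attempt.succeeded]
    if not steps:
        raise ValueError("trajectory contains no successful actions")
    return {
        "procedure_id": procedure_id,
        "description": description,
        "when_to_use": when_to_use,
        "steps": steps,
        # Assumption: new procedures start as drafts until someone reviews them.
        "status": "draft",
    }
```

A real distiller would also generalize the steps (replace a specific branch name with a parameter, for instance), which is where a human or an LLM pass earns its keep.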

How procedures get used

At the start of a task, the agent searches the procedure library: "Does any existing procedure match this task?" If yes, it retrieves the steps and uses them as a scaffold. If no, it plans from scratch and may write a new procedure afterward.

The retrieval can be keyword-based (simple: match on description), vector-based (semantic: current task embedding vs procedure embeddings), or explicit (the user says "use the staging deploy procedure").
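The keyword-based and explicit variants are small enough to sketch directly. The version below scores each procedure by the fraction of task words that appear in its `description` and `when_to_use` text, and returns nothing when no procedure clears a threshold, so the agent falls back to planning from scratch. The `find_procedure` function, the scoring rule, and the `min_score` default of 0.5 are illustrative assumptions; a vector-based retriever would replace the overlap score with embedding similarity.

```python
import re


def tokens(text):
    """Lowercase word set; punctuation and quotes are ignored."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_procedure(task, library, explicit_id=None, min_score=0.5):
    """Return the best-matching procedure dict, or None if nothing fits.

    explicit_id: set when the user names the procedure directly.
    min_score:   assumed cutoff; below it the agent plans from scratch.
    """
    if explicit_id is not None:
        for proc in library:
            if proc["procedure_id"] == explicit_id:
                return proc
        return None

    task_tokens = tokens(task)
    if not task_tokens:
        return None

    best, best_score = None, 0.0
    for proc in library:
        proc_tokens = tokens(proc["description"] + " " + proc["when_to_use"])
        # Fraction of the task's words that the procedure text covers.
        score = len(task_tokens & proc_tokens) / len(task_tokens)
        if score > best_score:
            best, best_score = proc, score
    return best if best_score >= min_score else None
```

Returning `None` instead of the nearest match is the important part: a forced match is how an agent ends up following a procedure that does not fit the task.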

A worked example: a customer-support agent getting smarter

Month 1. Agent sees 50 "my invoice is wrong" tickets. It figures out the right flow: check the customer record, pull the latest invoice, compare line items to the order history, identify the discrepancy, either issue a credit or escalate if over $200.

Post-ticket distillation. Agent writes a procedure resolve_invoice_discrepancy with those five steps and an "escalate above $200" constraint.

Month 2. A new "my invoice is wrong" ticket arrives. Agent retrieves resolve_invoice_discrepancy, follows it directly. Resolves in 4 tool calls instead of 12. Success rate climbs from 70% to 95%.

Same model. Same tools. The only thing that changed is the agent started with a known recipe instead of exploring from scratch.
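Written out in the same shape as the staging example, the distilled procedure might look like the sketch below. The five steps and the $200 limit come from the example above; the field values for `description` and `when_to_use`, the `escalation_limit_usd` key, and the `final_action` helper are assumptions added for illustration.

```python
RESOLVE_INVOICE_DISCREPANCY = {
    "procedure_id": "resolve_invoice_discrepancy",
    "description": "Resolve a 'my invoice is wrong' ticket",
    "when_to_use": "Customer reports that an invoice is wrong",
    "steps": [
        "Check the customer record",
        "Pull the latest invoice",
        "Compare line items to the order history",
        "Identify the discrepancy",
        "Issue a credit, or escalate if over the limit",
    ],
    # Hypothetical key: stores the threshold as data so it can change
    # without rewriting the steps.
    "defaults": {"escalation_limit_usd": 200},
    "constraints": "Escalate above $200",
}


def final_action(discrepancy_usd, procedure=RESOLVE_INVOICE_DISCREPANCY):
    """Apply the escalation constraint: credit up to the limit, escalate above it."""
    limit = procedure["defaults"]["escalation_limit_usd"]
    return "escalate" if discrepancy_usd > limit else "issue_credit"
```

Keeping the limit in `defaults` rather than buried in a step's wording makes the constraint checkable in code instead of relying on the model to read it correctly every run.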

Who writes procedures

Humans. Safest and cleanest for high-stakes tasks. Team writes procedures; agent uses them.
Agents, auto-distilled. Post-task, a separate LLM call summarizes the successful trajectory into a procedure. Cheap, fast, noisy. Good for low-stakes tasks.
Hybrid. Agent drafts a procedure; human reviews and publishes. Usually the best of both.

Refinement over time

Procedures drift from reality. Build in review:

Track success rate per procedure. If a procedure starts failing, flag for review.
Log adaptations the agent had to make ("step 3 failed, had to try Y"). Feed these back into the procedure.
Version procedures, keep the old versions, and measure whether updates actually help.
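These three review practices can share one small bookkeeping structure. The sketch below records each run against a (procedure, version) pair, stores any adaptation notes, and flags a version for review once its success rate drops below a threshold. The `ProcedureStats` class and the defaults of 5 minimum runs and an 0.8 success rate are assumptions, not recommended values.

```python
from collections import defaultdict


class ProcedureStats:
    """Per-version run log for procedures (hypothetical helper)."""

    def __init__(self, min_runs=5, min_success_rate=0.8):
        self.min_runs = min_runs
        self.min_success_rate = min_success_rate
        self.runs = defaultdict(list)         # (id, version) -> [bool, ...]
        self.adaptations = defaultdict(list)  # (id, version) -> [note, ...]

    def record(self, procedure_id, version, succeeded, adaptation=None):
        """Log one run; adaptation is a free-text note such as
        'step 3 failed, had to try Y'."""
        key = (procedure_id, version)
        self.runs[key].append(succeeded)
        if adaptation:
            self.adaptations[key].append(adaptation)

    def success_rate(self, procedure_id, version):
        """Fraction of successful runs, or None if there are no runs yet."""
        outcomes = self.runs.get((procedure_id, version), [])
        return sum(outcomes) / len(outcomes) if outcomes else None

    def needs_review(self, procedure_id, version):
        """Flag only after enough runs to make the rate meaningful."""
        outcomes = self.runs.get((procedure_id, version), [])
        if len(outcomes) < self.min_runs:
            return False
        return sum(outcomes) / len(outcomes) < self.min_success_rate
```

Because runs are keyed by version, comparing `success_rate("deploy_staging", 1)` with `success_rate("deploy_staging", 2)` gives a direct read on whether an update helped.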

Pitfalls

Over-trusting procedures. Following a stale procedure blindly. Procedures are scaffolds, not scripts.
No match signal. Agent uses a procedure that doesn't actually fit. Add explicit "when to use" criteria.
Auto-writing from failed runs. Model mistakes a failure for success and writes a bad procedure. Validate before publishing.
Procedure sprawl. 200 procedures, agent can't pick the right one. Merge, rename, keep the library tight.

What to do with this

Write 3-5 procedures for your agent's most common tasks by hand. You'll feel the quality improvement immediately.
Read memory system design for composing all four memory types.
Read planning loops, which frequently retrieve procedures as the plan skeleton.