When an agent forgets, yelling "memory" is like yelling "engine" at a broken car.
Very dramatic. Not very diagnostic.
Many common agent failures that look like "bad memory" come down to one of three monsters, plus one swamp:
- Active context: what the model can see right now.
- Durable memory: stable facts that should survive across sessions.
- Source-grounded retrieval: looking up the file, note, repo, transcript, invoice, or spec before answering.
- Profile and instruction bloat: the rules and persona text you keep stuffing into the prompt until the agent needs a forklift to think.
If you treat the three monsters and the bloat swamp as the same thing, you will fix the wrong layer.
This checklist is for agent users, builders, and operators who keep saying "it forgot" but cannot tell which layer failed.
The quick diagnostic
Use this table before you add another memory plugin, paste a giant instruction block, or yell at the machine like that will help.
| Symptom | Likely monster | What probably happened | Better fix |
|---|---|---|---|
| "It ignored what I said five minutes ago." | Active context | The detail was never in the current prompt, got summarized away, or was buried under too much other material. | Re-state the critical fact in the current task, shorten the working context, or pin the key instruction near the work. |
| "It forgot my preference from last week." | Durable memory | The preference was never saved, was saved too vaguely, or was overwritten by stale notes. | Save a compact stable fact. Delete or replace stale memories. Test recall later. |
| "It answered without reading the document." | Source-grounded retrieval | The agent guessed from memory or vibes when it needed the source. | Make source lookup mandatory. Require file paths, citations, or read-back evidence before the answer. |
| "It keeps following old project instructions." | Profile/instruction bloat | The profile became a landfill for old rules, temporary status, workflows, and emotional support paragraphs. | Move procedures to skills, current status to project state, and stale facts to the bin. |
| "It knows the rule but still does the wrong thing." | Mixed failure | The rule may be stored, but it was not visible at decision time, or it conflicts with another instruction. | Check the active prompt, the durable memory, and the profile text for conflict. Then remove the weaker rule. |
Monster 1: active context
Active context is what the model can see right now. Not what exists somewhere on disk. Not what you told it in a different chat. Not what you wish it had inferred from your tone.
If a detail is outside the current context window, the agent is not "forgetful." It is blind.
Toy example:
- Bad: "Use the brief we discussed before."
- Better: "Use brief.md; read it before drafting; cite the three claims you used."
Active context problems usually show up during long chats, large tasks, or multi-step work. The agent starts confidently dropping constraints because the important line is buried under a mountain of old discussion.
Quick checks:
- Did the agent read the current source file in this run?
- Is the key constraint visible in the latest task message or working notes?
- Did a summary remove the detail that mattered?
- Is the prompt carrying so much tool/schema/profile text that the task itself is competing for oxygen?
Fixes:
- Put the current task and non-negotiables near the top of the working context.
- Use shorter task packets.
- Ask for read-back of critical constraints before irreversible work.
- Split giant jobs into source lookup, draft, verification, and handoff.
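The fixes above can be sketched as a small task packet that puts the task and the non-negotiables first, plus a crude read-back check. This is a minimal sketch, not a real framework: `TaskPacket`, `render`, and `missing_from_readback` are hypothetical names invented for illustration, and the substring match is an assumption about how you might compare a read-back to the rules.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPacket:
    """Hypothetical container for one unit of work handed to an agent."""

    task: str
    non_negotiables: list[str]
    sources: list[str] = field(default_factory=list)
    background: str = ""

    def render(self) -> str:
        # Task and hard constraints go first so they are not buried
        # under background material.
        lines = [f"TASK: {self.task}", "NON-NEGOTIABLES:"]
        lines += [f"- {rule}" for rule in self.non_negotiables]
        if self.sources:
            lines.append("READ BEFORE ANSWERING:")
            lines += [f"- {path}" for path in self.sources]
        if self.background:
            lines += ["BACKGROUND:", self.background]
        return "\n".join(lines)


def missing_from_readback(packet: TaskPacket, readback: str) -> list[str]:
    """Return constraints the agent did not repeat back.

    Uses a case-insensitive substring match, which is deliberately crude.
    """
    lowered = readback.lower()
    return [rule for rule in packet.non_negotiables if rule.lower() not in lowered]


packet = TaskPacket(
    task="Draft the public summary",
    non_negotiables=["Use the title from brief.md", "Cite the three claims you used"],
    sources=["brief.md"],
)
print(packet.render())
print(missing_from_readback(packet, "I will use the title from brief.md."))
# -> ['Cite the three claims you used']
```

If the read-back check returns anything, stop before the irreversible step and restate the missing constraint in the current task message.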
Monster 2: durable memory
Durable memory is for stable facts that should survive sessions.
Good durable memory:
- "User prefers concise progress updates."
- "Project uses the /outputs/ directory for final artifacts."
- "Do not use real customer screenshots in public examples."
Bad durable memory:
- "Draft 2 is in progress."
- "Fix the checklist tomorrow."
- "The last run failed on line 83."
- Every thought anyone had during a three-hour debugging session.
Memory is not a diary. It is not a todo list. It is not a trophy cabinet for old decisions that stopped being true.
If you save temporary status as durable memory, the agent will eventually treat yesterday's mud as today's law. Congratulations, you built a tiny haunted bureaucracy.
Quick checks:
- Is the fact still likely to be true next month?
- Is it short enough to be useful when injected into future work?
- Does it reduce the user's need to repeat themselves?
- Is there a stale memory that now contradicts it?
Fixes:
- Save stable preferences, environment facts, and durable conventions.
- Replace stale memories instead of piling new ones on top.
- Keep task progress in project state, issue trackers, logs, or handoff files.
- Test recall with a simple future prompt: "What do you know about my preference for X?"
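One way to picture "replace, do not pile up" is a store that holds one compact fact per key. The sketch below is a toy in-memory version under stated assumptions: `MemoryStore`, its methods, and the 120-character cap are all hypothetical and do not describe any particular memory tool.

```python
from typing import Optional


class MemoryStore:
    """Toy durable-memory store: one compact fact per key (illustrative only)."""

    MAX_CHARS = 120  # arbitrary cap chosen for this sketch

    def __init__(self) -> None:
        self._facts: dict[str, str] = {}

    def save(self, key: str, fact: str) -> None:
        fact = fact.strip()
        if len(fact) > self.MAX_CHARS:
            raise ValueError("fact too long; save a compact statement instead")
        # Overwrite on purpose: a new fact replaces the stale one.
        self._facts[key] = fact

    def forget(self, key: str) -> None:
        # Removing a key that is not stored is a no-op.
        self._facts.pop(key, None)

    def recall(self, key: str) -> Optional[str]:
        return self._facts.get(key)

    def all_facts(self) -> dict[str, str]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._facts)


store = MemoryStore()
store.save("update_style", "User prefers detailed progress updates.")
store.save("update_style", "User prefers concise progress updates.")
print(store.recall("update_style"))
# -> User prefers concise progress updates.
```

Keying by topic is the design choice that matters here: the second save cannot coexist with the first, so there is no stale twin left to contradict it.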
Monster 3: source-grounded retrieval
Retrieval is the boring adult in the room.
When an answer depends on a document, repo, transcript, contract, invoice, research brief, support thread, or dashboard, the agent should fetch the source. Durable memory can remind it that a source exists. Active context can carry a small excerpt. Neither replaces reading the thing.
Toy example:
- Bad: "Summarize the contract from memory."
- Better: "Read contract.md, quote no private clauses in the public version, and list the sections used in the source map."
Retrieval problems often masquerade as confidence. The agent sounds fluent because language models are excellent at sounding fluent. That does not mean the file was read.
Quick checks:
- Did the agent name the source it used?
- Did it read the source during this run, or is it relying on old context?
- Does the answer separate evidence, inference, and assumption?
- Can you trace the claim back to a file, URL, or record?
Fixes:
- Require source paths or citation IDs for claims that matter.
- Require read-back verification before final output.
- Keep a source map for public resources.
- Make "I could not access the source" an allowed answer. It is better than a confident fake.
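A minimal sketch of "read it in this run, or say you could not" might look like the following. `SourceRead` and `read_source` are hypothetical names, and the sketch assumes sources are local UTF-8 text files; a real setup might fetch URLs or records instead.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SourceRead:
    """Result of one attempt to read a source during the current run."""

    path: str
    ok: bool
    text: str = ""
    note: str = ""


def read_source(path: str) -> SourceRead:
    """Read a local text file, or return an explicit failure instead of guessing."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except OSError as exc:
        # Missing file, permission problem, and similar: report it plainly.
        return SourceRead(
            path=path,
            ok=False,
            note=f"I could not access the source ({type(exc).__name__}).",
        )
    return SourceRead(path=path, ok=True, text=text)


def source_line(result: SourceRead) -> str:
    """One line to attach to an answer so the claim can be traced."""
    return f"Source: {result.path}" if result.ok else result.note
```

The point of the `ok` flag is that a failed read becomes a visible result the caller has to handle, not a gap the model quietly fills with fluent prose.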
The swamp: profile and instruction bloat
Profile instructions tell the agent how to behave. They should not become a compost heap.
A useful profile says what role the agent plays, what quality bar it must meet, what safety boundaries matter, and where procedures live. A bloated profile tries to carry everything: old project status, one-off preferences, tool manuals, motivational slogans, stale plans, and five versions of the same rule fighting in a trench coat.
Toy example:
- Bad profile note: "We are currently working on Draft 2; if the user asks about memory, mention yesterday's checklist; also remember the test failed once; also here are 400 lines of setup notes."
- Better split:
  - Profile: "This agent writes public-safe operational content."
  - Memory: "User prefers blunt useful-first content."
  - Skill: "Public resource verification workflow."
  - Project state: "Draft 2 awaiting risk review."
Quick checks:
- Is this instruction still true across many tasks?
- Does it belong in a reusable skill instead?
- Is it temporary project status?
- Does it conflict with another instruction?
- Would removing it make the agent less confused?
Fixes:
- Keep role and safety rules in the profile.
- Put procedures in skills or playbooks.
- Put current status in project files, task comments, or trackers.
- Review old profile text like you review old code: if it has no job, delete it.
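Reviewing profile text "like old code" can start with a crude lint pass. The sketch below flags lines containing time-bound wording as candidates to move out. The word list and the name `flag_temporary_lines` are assumptions made for illustration; a keyword match will miss plenty and will sometimes flag a perfectly stable rule, so treat the output as a prompt for human review.

```python
import re

# Hypothetical hint list: words that often signal temporary status.
TEMPORAL_HINTS = re.compile(
    r"\b(currently|yesterday|today|tomorrow|this week|last run|in progress)\b",
    re.IGNORECASE,
)


def flag_temporary_lines(profile_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that look like temporary status."""
    flagged = []
    for number, line in enumerate(profile_text.splitlines(), start=1):
        if TEMPORAL_HINTS.search(line):
            flagged.append((number, line.strip()))
    return flagged


profile = "\n".join(
    [
        "This agent writes public-safe operational content.",
        "We are currently working on Draft 2.",
        "If the user asks about memory, mention yesterday's checklist.",
    ]
)
for number, line in flag_temporary_lines(profile):
    print(number, line)
```

On this toy profile, lines 2 and 3 are flagged and the role statement on line 1 is left alone.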
The five-minute triage checklist
Before changing your agent setup, answer these in order.
| Question | If yes | If no |
|---|---|---|
| Did the agent have the needed detail in active context? | Look for conflicts or weak wording. | Put the detail in the current task packet. |
| Is the detail a stable fact worth remembering? | Save it as compact durable memory. | Put it in project state, not memory. |
| Does the answer depend on a source? | Force retrieval and record the source used. | Do not pretend retrieval is the fix. |
| Is the profile carrying temporary or stale material? | Move it out or delete it. | Keep the profile lean. |
| Can you reproduce the failure with a toy prompt? | Fix the smallest failing layer. | Gather evidence before redesigning the system. |
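The table above can also be written down as data and walked in order. This is a sketch only: the keys in `TRIAGE_STEPS` and the function `triage` are hypothetical names, and the yes/no answers still have to come from a human looking at the actual incident.

```python
# Each step: (key, question, action if yes, action if no), in table order.
TRIAGE_STEPS = [
    ("in_active_context",
     "Did the agent have the needed detail in active context?",
     "Look for conflicts or weak wording.",
     "Put the detail in the current task packet."),
    ("stable_fact",
     "Is the detail a stable fact worth remembering?",
     "Save it as compact durable memory.",
     "Put it in project state, not memory."),
    ("depends_on_source",
     "Does the answer depend on a source?",
     "Force retrieval and record the source used.",
     "Do not pretend retrieval is the fix."),
    ("profile_has_stale_material",
     "Is the profile carrying temporary or stale material?",
     "Move it out or delete it.",
     "Keep the profile lean."),
    ("reproducible_with_toy_prompt",
     "Can you reproduce the failure with a toy prompt?",
     "Fix the smallest failing layer.",
     "Gather evidence before redesigning the system."),
]


def triage(answers: dict[str, bool]) -> list[tuple[str, str]]:
    """Return (question, recommended action) pairs in checklist order."""
    return [
        (question, if_yes if answers[key] else if_no)
        for key, question, if_yes, if_no in TRIAGE_STEPS
    ]


incident = {
    "in_active_context": False,
    "stable_fact": True,
    "depends_on_source": False,
    "profile_has_stale_material": False,
    "reproducible_with_toy_prompt": True,
}
for question, action in triage(incident):
    print(f"{question} -> {action}")
```

A missing key raises `KeyError`, which is intentional in this sketch: skipping a question is how the wrong layer gets fixed.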
Three toy failures and the right fix
Failure A: "The agent forgot I like short updates"
Diagnosis: durable memory.
Fix: save one stable preference: "User prefers concise progress updates." Do not save the whole conversation where they complained. The agent does not need a museum exhibit.
Failure B: "The agent wrote the wrong title even though the brief had the title"
Diagnosis: active context or retrieval.
Fix: make the agent read the brief in the current run and repeat the required title before drafting. If the title is in a file, retrieval is part of the job.
Failure C: "The agent keeps referencing an old campaign"
Diagnosis: profile bloat or stale memory.
Fix: find where the old campaign is stored. If it is in memory, replace or remove it. If it is in the profile, move current campaign status into project state and leave the profile for stable role rules.
A practical storage rule
Use this when deciding where information belongs:
| Information type | Put it here | Example |
|---|---|---|
| Stable preference | Durable memory | "User prefers no fake enthusiasm." |
| Reusable workflow | Skill/playbook | "How to verify a public markdown resource." |
| Current task status | Project state or task tracker | "Draft complete; awaiting review." |
| Evidence for a claim | Source file, URL, or source map | "Generalized reviewed-source note." |
| Role and boundary | Profile instructions | "This agent writes public-safe content and refuses live posting without approval." |
| One-off scratch thought | Nowhere durable | "Maybe use a goblin joke here." |
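If you want the storage rule somewhere a script can check it, a plain lookup is enough. The enum members and `where_to_store` below are hypothetical names that mirror the rows of the table; nothing here is tied to a specific agent platform.

```python
from enum import Enum


class InfoType(Enum):
    STABLE_PREFERENCE = "stable preference"
    REUSABLE_WORKFLOW = "reusable workflow"
    CURRENT_TASK_STATUS = "current task status"
    EVIDENCE_FOR_CLAIM = "evidence for a claim"
    ROLE_AND_BOUNDARY = "role and boundary"
    SCRATCH_THOUGHT = "one-off scratch thought"


# Destinations copied from the table above.
STORAGE_RULE = {
    InfoType.STABLE_PREFERENCE: "Durable memory",
    InfoType.REUSABLE_WORKFLOW: "Skill/playbook",
    InfoType.CURRENT_TASK_STATUS: "Project state or task tracker",
    InfoType.EVIDENCE_FOR_CLAIM: "Source file, URL, or source map",
    InfoType.ROLE_AND_BOUNDARY: "Profile instructions",
    InfoType.SCRATCH_THOUGHT: "Nowhere durable",
}


def where_to_store(info_type: InfoType) -> str:
    """Look up the destination for one kind of information."""
    return STORAGE_RULE[info_type]


print(where_to_store(InfoType.CURRENT_TASK_STATUS))
# -> Project state or task tracker
```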
What to do next
Pick one recent "the agent forgot" incident and classify it:
- Was the missing detail in active context?
- Should it have been durable memory?
- Was a source lookup required?
- Did profile bloat or stale instructions push the agent the wrong way?
Then fix one layer. Not all of them. All-at-once cleanup feels productive until you accidentally teach the goblin seventeen new ways to be wrong.
Memory is not a vibe.
It is context discipline, durable facts, source retrieval, and hygiene.
Find which monster bit you.
Evidence and boundaries
This resource is based on generalized patterns from agent-builder discussions reviewed internally. It does not quote raw archive material, name participants, use private chat fragments, or imply affiliation, endorsement, partnership, or community membership with any project, platform, company, or community.
It is a diagnostic checklist, not a product claim, benchmark, or universal law. Memory tools, context systems, retrieval layers, and profile conventions vary. Test your own setup before treating any rule as law.