Audience: builders, operators, and small teams whose agents keep forgetting things, remembering the wrong things, mixing projects, bloating their instructions, or confidently guessing when a source file exists.
Promise: a practical sorting system for what belongs in active context, durable memory, reusable skills, project state, source-grounded retrieval, and proof artifacts — so your agent becomes less haunted filing cabinet and more useful operator.
Ana version: sharp, useful, allergic to mystical “memory will fix it” fog. Memory is not a vibe. It is a boundary, a cleanup habit, and a receipt trail.
The blunt premise
When an agent forgets, yelling “memory” is like yelling “engine” at a broken car.
Very dramatic.
Not very diagnostic.
“The agent forgot” can mean at least six different failures:
- It could not see the needed context in the current conversation.
- The stable fact was never saved as durable memory.
- A reusable procedure was trapped in a one-off chat instead of a skill.
- Current project status was buried in someone’s memory when it belonged in a state file or task log.
- The agent trusted memory when it should have fetched the source.
- Nobody left proof that the work happened.
If you dump all six into one mystical bucket called “memory,” congratulations: you built a haunted filing cabinet with autocomplete.
This checklist separates the monsters.
Quick sorting rule
Use this before saving anything.
| If the information is... | Put it in... | Example using toy data | Why |
|---|---|---|---|
| Needed only for this conversation or task | Active context | “For this draft, use the mushroom-shop example.” | Keeps temporary instructions temporary |
| A stable preference or environment fact | Durable memory | “Sam prefers weekly summaries under five bullets.” | Saves repeated steering without storing the whole project |
| A reusable workflow, command sequence, checklist, or pitfall | Skill / procedure | “How to verify a local report before handoff.” | Makes repeat work reliable and maintainable |
| Current project status, selected assets, approvals, manifests, task outcomes | Project state / task log | “Draft B selected; awaiting risk review.” | Prevents stale operational state from masquerading as permanent truth |
| A claim that depends on a file, doc, repo, transcript, invoice, or live system | Source-grounded retrieval | “Read the current contract before summarizing payment terms.” | Stops confidence theatre when evidence exists |
| A completion claim | Proof artifact | “Report path, byte count, checksum, source map, test output.” | Makes “done” auditable instead of decorative |
Tattoo it on the goblins: facts in memory, procedures in skills, current work in state, evidence from sources, proof in artifacts.
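As a rough illustration, the table above can be sketched as a small routing function. Everything here is hypothetical: the `Item` flags, the `Layer` names, and especially the order of the checks, which assumes the riskiest misfilings should be caught first. Treat it as a thinking aid, not a memory architecture.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    ACTIVE_CONTEXT = "active context"
    DURABLE_MEMORY = "durable memory"
    SKILL = "skill / procedure"
    PROJECT_STATE = "project state / task log"
    RETRIEVAL = "source-grounded retrieval"
    PROOF = "proof artifact"


@dataclass
class Item:
    """Hypothetical description of one piece of information to file."""
    text: str
    is_completion_claim: bool = False
    depends_on_source: bool = False
    is_current_status: bool = False
    is_reusable_procedure: bool = False
    is_stable_fact: bool = False


def sort_item(item: Item) -> Layer:
    """Pick a layer for an item; the check order is an assumption."""
    if item.is_completion_claim:
        return Layer.PROOF
    if item.depends_on_source:
        return Layer.RETRIEVAL
    if item.is_current_status:
        return Layer.PROJECT_STATE
    if item.is_reusable_procedure:
        return Layer.SKILL
    if item.is_stable_fact:
        return Layer.DURABLE_MEMORY
    # Nothing else claimed it, so it stays temporary.
    return Layer.ACTIVE_CONTEXT
```

Note the design choice: current status outranks stable fact, so "Draft B selected" lands in project state even if someone insists it feels permanent.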
Layer 1: active context
Active context is what the agent can see right now: the current request, the messages in scope, and any files or outputs loaded into the working window.
Use active context for:
- temporary task preferences;
- one-time constraints;
- drafts and options currently being compared;
- source excerpts you are actively reasoning over;
- approvals that apply only to this task and have not been recorded elsewhere.
Do not use active context as:
- a permanent knowledge base;
- a project database;
- a dumping ground for every old decision;
- proof that something will be remembered tomorrow.
Checklist:
- [ ] Is this useful only for the current task?
- [ ] Would saving it permanently create confusion later?
- [ ] Does the next worker need it in a project state file instead?
- [ ] If the conversation disappears, is anything that matters already preserved in a durable artifact?
Toy example:
- Good active-context note: “For this example, call the fictional customer Maple Bike Shop.”
- Bad durable-memory entry: “Maple Bike Shop is the current customer.” That is fictional, task-specific, and begging to bite you later.
Layer 2: durable memory
Durable memory is for stable facts the user or team would reasonably expect the agent to remember across sessions.
Good durable memory is compact, declarative, and boring in the best way.
Use durable memory for:
- user communication preferences;
- stable project conventions;
- recurring environment facts;
- long-lived boundaries the user repeatedly enforces;
- corrections that prevent future mistakes.
Do not use durable memory for:
- secrets or credential values;
- raw customer data;
- temporary task progress;
- “today’s selected draft” unless it is a durable project decision recorded elsewhere too;
- stale approvals;
- full transcripts, private logs, or giant source dumps.
Good memory entry shape:
- “User prefers concise Friday status updates with links to artifacts.”
- “Project Phoenix uses the `public-safe examples only` rule for external resources.”
Bad memory entry shape:
- “Remember to finish the Tuesday report.”
- “The current token is `example-redacted`.”
- “Draft 3 was approved today, probably.”
Why the bad ones fail:
- TODOs belong in a task system.
- Credential values do not belong in memory.
- Approval state belongs in a dated project log with source and scope.
Durable memory checklist:
- [ ] Will this still be true in a month?
- [ ] Does it reduce future user repetition?
- [ ] Is it short enough to avoid prompt bloat?
- [ ] Is it declarative, not an imperative that could override a future request?
- [ ] Is it free of secrets, customer records, raw logs, and temporary progress?
- [ ] Is there a plan to remove or update it when it becomes stale?
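Part of this checklist can be approximated with a crude lint pass before anything is saved. The sketch below is an assumption-heavy illustration: the 200-character budget, the keyword lists, and the function name `memory_entry_problems` are all invented here, and keyword matching will miss plenty. It flags candidates for a human or agent to review; it does not prove an entry is safe.

```python
import re

MAX_CHARS = 200  # assumed budget; tune to your own prompt limits

# Assumed heuristics: words that suggest a credential or a short shelf life.
SECRET_HINTS = re.compile(r"\b(password|token|api[_ -]?key|secret)\b", re.IGNORECASE)
TEMPORARY_HINTS = re.compile(r"\b(today|tomorrow|current(ly)?|this week|probably)\b", re.IGNORECASE)
IMPERATIVE_STARTS = ("remember to", "always", "never", "do not", "don't", "make sure")


def memory_entry_problems(entry: str) -> list[str]:
    """Return reasons an entry looks wrong for durable memory (empty if none found)."""
    text = entry.strip()
    problems = []
    if len(text) > MAX_CHARS:
        problems.append("too long")
    if SECRET_HINTS.search(text):
        problems.append("mentions a credential")
    if TEMPORARY_HINTS.search(text):
        problems.append("sounds temporary")
    if text.lower().startswith(IMPERATIVE_STARTS):
        problems.append("imperative, not declarative")
    return problems
```

Run against the toy entries above, the good shape comes back clean, the Tuesday-report TODO is flagged as imperative, and the token line trips both the credential and the temporary checks.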
Layer 3: skills and procedures
A skill is a reusable procedure: the steps, pitfalls, commands, criteria, and verification habits that make a task work the next time.
Use skills for:
- recurring workflows;
- setup recipes;
- quality gates;
- troubleshooting playbooks;
- review checklists;
- known pitfalls and recovery steps.
Do not use skills for:
- one-off task status;
- private customer details;
- credential values;
- “current campaign is approved” notes;
- giant copies of external docs that will rot.
A good skill answers:
- When should this skill be used?
- What are the exact steps?
- What must be verified?
- What breaks often?
- What should the agent refuse or escalate?
Toy example:
- Memory: “Sam prefers short audio summaries when driving.”
- Skill: “How to produce a short audio summary: draft under 60 seconds, avoid numbers not in the source, generate audio, verify file exists, report path.”
- Project state: “Episode 4 audio draft generated; awaiting review.”
If you keep rediscovering the same workflow in chat, stop making the goblin solve a maze it has already solved. Turn the maze into a skill.
Skill hygiene checklist:
- [ ] Does the workflow happen more than once?
- [ ] Are steps specific enough that a different worker can use them?
- [ ] Are commands or tool names current, tested, or clearly labeled conceptual?
- [ ] Does the skill include verification, not just creation?
- [ ] Are private examples replaced with toy examples?
- [ ] Is there a maintenance path when the workflow changes?
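One way to keep a skill honest is to give it a shape that makes the gaps visible. The sketch below maps the five "a good skill answers" questions onto a hypothetical `Skill` record; the field names are invented for illustration and do not correspond to any particular tool's skill format.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical skill record; one field per question a good skill answers."""
    name: str
    when_to_use: str = ""
    steps: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)
    pitfalls: list[str] = field(default_factory=list)
    escalate_when: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """List the sections that are still empty, in declaration order."""
        sections = {
            "when_to_use": self.when_to_use,
            "steps": self.steps,
            "verification": self.verification,
            "pitfalls": self.pitfalls,
            "escalate_when": self.escalate_when,
        }
        return [label for label, value in sections.items() if not value]


audio_summary = Skill(
    name="short audio summary",
    when_to_use="A short spoken summary is requested.",
    steps=["draft under 60 seconds", "avoid numbers not in the source", "generate audio"],
    verification=["verify file exists", "report path"],
)
print(audio_summary.missing_sections())
```

The toy skill from earlier has steps and verification but no recorded pitfalls or escalation rules yet, which is exactly the kind of gap worth seeing before the second run rather than after the fifth.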
Layer 4: project state
Project state is the current truth of the work: what is selected, shipped, blocked, rejected, approved, pending, or superseded.
Use project state for:
- current campaign status;
- selected/rejected assets;
- task outcomes;
- dated approvals and their scope;
- manifests and source maps;
- next gates and blockers;
- “this public page is live” or “this is only a draft.”
Do not store current project state only in durable memory. That is how stale facts put on a tiny crown and start issuing orders.
Project-state checklist:
- [ ] Is this fact about the current status of a project?
- [ ] Does it need a date, source, owner, or approval scope?
- [ ] Could it change next week?
- [ ] Will downstream workers need to read it without relying on chat history?
- [ ] Does it belong in a manifest, task comment, changelog, run report, or source map?
Toy example:
- Good project state: “2026-06-25: `diagram-v2.png` selected for the guide; publication still requires risk review.”
- Bad durable memory: “Use diagram v2.” In two weeks, that turns into a ghost instruction.
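A dated, scoped state entry is easy to sketch as a record that serializes to one log line. The `StateEntry` fields and the JSON-lines style below are assumptions chosen for illustration; a task comment, changelog, or manifest that carries the same date, scope, source, and next gate would do the same job.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class StateEntry:
    """Hypothetical project-state record with the context memory tends to strip."""
    recorded_on: date
    subject: str
    status: str
    scope: str
    source: str
    next_gate: str


def to_log_line(entry: StateEntry) -> str:
    """Serialize one entry as a single JSON line with an ISO date."""
    record = asdict(entry)
    record["recorded_on"] = entry.recorded_on.isoformat()
    return json.dumps(record, sort_keys=True)


entry = StateEntry(
    recorded_on=date(2026, 6, 25),
    subject="diagram-v2.png",
    status="selected",
    scope="the guide",
    source="toy example, no real approval",
    next_gate="risk review",
)
print(to_log_line(entry))
```

Because the record is frozen and dated, a later change becomes a new entry instead of a silent rewrite, which keeps superseded decisions visible instead of haunting.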
Layer 5: source-grounded retrieval
Source-grounded retrieval means the agent goes back to the file, document, database, transcript, repo, dashboard, or official source before making a claim that depends on it.
Use retrieval when the answer depends on:
- a current file;
- a legal or policy document;
- a customer record;
- a source quote;
- a repository state;
- a public page that may have changed;
- a vendor or platform rule;
- a previous deliverable.
Memory can remind the agent where to look. It should not replace looking.
Retrieval checklist:
- [ ] Is there an original source for this fact?
- [ ] Could the source have changed?
- [ ] Would guessing create legal, financial, reputational, or operational risk?
- [ ] Did the agent separate direct evidence from inference?
- [ ] Did it record sources used in the final artifact?
Toy example:
- Good: “I remember the project stores approved examples in a state file; I will read the state file before naming the approved example.”
- Bad: “I remember this was approved, so let’s publish it.” That sentence is how goblins get lawsuits.
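The good toy example can be sketched as "read the state file first, then decide." The file layout is an assumption: a JSON object with a hypothetical `approved_example` key. The function names and return strings are likewise invented for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def approved_example(state_file: Path) -> Optional[str]:
    """Return the approved example named in the state file, or None if absent."""
    if not state_file.is_file():
        return None  # no source, no claim
    state = json.loads(state_file.read_text(encoding="utf-8"))
    if not isinstance(state, dict):
        return None
    value = state.get("approved_example")  # assumed key name
    return value if isinstance(value, str) else None


def publish_decision(state_file: Path, candidate: str) -> str:
    """Compare what the agent remembers (candidate) with what the source says."""
    approved = approved_example(state_file)
    if approved is None:
        return "escalate: no approval found in source"
    if approved != candidate:
        return f"blocked: source names {approved!r}, not {candidate!r}"
    return f"ok: {candidate!r} matches the source"
```

Memory supplies the path and the candidate; the file supplies the answer. When the two disagree, the source wins and the mismatch is reported instead of smoothed over.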
Layer 6: proof artifacts
Proof artifacts are the receipts that make agent work inspectable.
Use proof artifacts for:
- final file paths or public-safe labels;
- byte counts and checksums where practical;
- source maps;
- verification logs;
- screenshots or render checks when visual output matters;
- external side-effect records;
- approval state.
A proof artifact does not need to be fancy. It needs to be findable, specific, and true.
Proof checklist:
- [ ] What exactly was produced?
- [ ] Where is the durable version?
- [ ] Was it read back after writing?
- [ ] Were JSON/config/code files syntax-checked?
- [ ] Were public-facing files scanned for private paths, secrets, customer data, unsupported claims, and pricing/service promises?
- [ ] Were sources listed?
- [ ] Were external side effects recorded as “none” or listed precisely?
- [ ] Is the approval status explicit?
No receipt, no victory lap.
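A minimal receipt can be produced at write time. The sketch below writes a file, reads it back, and records the path, byte count, and SHA-256 checksum. The function name and the receipt keys are assumptions; the side-effect field is "none" only because this function does nothing beyond a local write.

```python
import hashlib
from pathlib import Path


def write_with_receipt(path: Path, content: str, sources: list[str],
                       approval_status: str) -> dict:
    """Write content, read it back, and return a small proof record."""
    data = content.encode("utf-8")
    path.write_bytes(data)

    # Read-back: confirm what is on disk is what we meant to write.
    read_back = path.read_bytes()
    if read_back != data:
        raise OSError(f"read-back mismatch for {path}")

    return {
        "path": str(path),
        "bytes": len(read_back),
        "sha256": hashlib.sha256(read_back).hexdigest(),
        "sources": list(sources),
        "external_side_effects": "none",  # true for this local write only
        "approval_status": approval_status,
    }
```

The receipt covers only some of the checklist: syntax checks and public-safety scans still need their own steps, and the approval status is whatever the caller passes in, so it should come from a dated record rather than from memory.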
Do not store this
| Do not store | Why it is dangerous | Safer place or action | Toy example |
|---|---|---|---|
| Secrets | Memory and docs can be exposed, copied, summarized, or pasted into the wrong place | Secret manager or approved credential store; record capability, never value | Store “email sending is configured,” not the password |
| Temporary task progress | It becomes stale almost immediately and pollutes future sessions | Task board, run report, or project log | “Draft half done” belongs in a task comment |
| Raw customer data | Privacy, consent, retention, and leakage risk | Approved CRM/data store with policy controls; summarize only when allowed | Do not save a customer’s full complaint history as memory |
| Private logs | Logs often contain identifiers, paths, tokens, and operational details | Keep in internal log storage; cite only sanitized findings | Do not paste stack traces with hidden account IDs into public resources |
| Stale approvals | Approvals have scope, date, owner, and expiry; memory strips that context | Dated approval record or project state file | “Approved for this internal draft” is not “approved forever” |
| Credentials | Credential labels can be okay; credential values are radioactive | Secure credential store; mention only the required capability | “Needs read-only calendar access” is fine; the access key is not |
| Private profile instructions | They can reveal internal strategy, tool access, or user preferences | Generalize into public-safe principles | Say “use profile separation,” not “copy the private profile text” |
| Real account/channel IDs | They can enable targeting, scraping, impersonation, or accidental public action | Use fictional IDs or placeholders | Use channel-123-example, not a real destination ID |
| Full source dumps | They bloat context and can violate privacy or copyright expectations | Source map plus short cited excerpts where allowed | Summarize the finding; do not paste the entire transcript |
Cleanup rhythm
Memory hygiene is maintenance, not a spiritual awakening.
Weekly or per-project:
- [ ] Remove completed TODOs from memory-like places.
- [ ] Move current status into project state.
- [ ] Convert repeated workflows into skills.
- [ ] Delete or update stale approvals.
- [ ] Check for contradictory durable facts.
- [ ] Confirm source maps point to the right evidence.
- [ ] Verify public drafts contain toy examples only.
Monthly or after major workflow changes:
- [ ] Review durable memories for stale environment facts.
- [ ] Patch skills with newly discovered pitfalls.
- [ ] Archive superseded state files instead of letting them compete.
- [ ] Check that public resources do not expose private paths, account facts, or real logs.
- [ ] Make sure proof requirements still match the risk of the work.
If the cleanup feels boring, good. Boring is where the reliable money hides.
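One slice of that boring work, finding entries old enough to deserve a second look, can be sketched in a few lines. The 30-day default and the `recorded_on` key are assumptions for illustration; age alone does not make an entry wrong, it only puts it on the review list.

```python
from datetime import date, timedelta


def stale_entries(entries: list[dict], today: date, max_age_days: int = 30) -> list[dict]:
    """Return entries whose 'recorded_on' date is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [entry for entry in entries if entry["recorded_on"] < cutoff]


toy_log = [
    {"note": "Draft B approved for internal review", "recorded_on": date(2026, 5, 1)},
    {"note": "diagram-v2.png selected for the guide", "recorded_on": date(2026, 6, 25)},
]
for entry in stale_entries(toy_log, today=date(2026, 6, 30)):
    print("review or remove:", entry["note"])
```

Passing `today` in explicitly keeps the check repeatable, so the same log gives the same answer whenever the cleanup is rerun for a given date.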
Diagnostic: which monster bit you?
| Symptom | Likely cause | First fix |
|---|---|---|
| The agent ignores a file you know exists | Source retrieval failure | Tell it to read the source and cite it before answering |
| The agent keeps asking the same stable preference | Missing durable memory | Save a compact, declarative preference if safe |
| The agent repeats a known workflow badly | Missing or stale skill | Create or patch the procedure with verification steps |
| The agent uses last week’s approval as if it still applies | Project state stored as memory | Move approval status into dated project state with scope |
| The agent gets slower and more confused over time | Instruction/profile bloat | Remove stale rules; move procedures into skills and status into state |
| The agent claims something is done but nobody can find it | Missing proof artifact | Require durable path, read-back, byte count/hash, and verification |
| The agent mixes two projects | Weak profile/project separation | Separate memory/state/skills by project or profile; do not let one lane quietly rewrite another |
Public-safe examples you can copy
These are fictional. Keep them fictional unless a real source has been approved for publication.
Durable memory example
Good:
“Mina prefers release notes grouped as Added, Changed, Fixed, and Risks.”
Why: stable preference, compact, no private data.
Bad:
“Mina approved the current campaign and said to post it everywhere.”
Why: approval scope is missing, likely stale, and too risky for memory.
Skill example
Good:
“Before publishing a report: read back final file, validate JSON, scan for private paths and credentials, record sources and external side effects.”
Why: reusable procedure with verification.
Bad:
“Use the secret launch checklist from Client A.”
Why: private client process and potentially sensitive details.
Project state example
Good:
“2026-06-25: Resource draft completed; publication blocked until risk review and destination URL are approved.”
Why: dated, scoped, and operational.
Bad:
“The resource is ready.”
Why: ready for what — review, publishing, outreach, sale, or deletion by goblins?
Source retrieval example
Good:
“The guide says to verify commands against current official docs before printing them as instructions.”
Why: source dependency is explicit.
Bad:
“I remember the command, probably.”
Why: probably is not a release process.
Proof artifact example
Good:
“Created the checklist, source map, example matrix, and verification record; JSON syntax checks passed; public markdown scan passed; no external side effects.”
Why: names artifacts and checks.
Bad:
“Done.”
Why: decorative noise with shoes on.
Pre-save checklist
Before you save information anywhere durable, ask:
- Is it stable?
- Is it safe?
- Is it compact?
- Is it in the right layer?
- Does it reduce repeated steering?
- Is there a better source of truth?
- Can someone safely delete or update it later?
If any answer is no, or a better source of truth exists, do not feed it to memory. Put it where it belongs or let it die with dignity.
What this resource is not
This is not a universal memory architecture, a benchmark, a vendor ranking, a security policy, legal advice, or a promise that any platform’s memory feature behaves a specific way.
Terms like memory, skills, state, and retrieval vary by tool. Treat this as an operator checklist. If you turn it into a platform-specific tutorial, verify current official docs before naming commands, flags, APIs, or provider behavior.
Use the checklist
Use this checklist before you add “better memory” to your agent.
If the problem is actually stale project state, missing source retrieval, or a procedure trapped in chat, buying more mythology will not fix it.
Find which monster bit you. Then put the fact where it belongs.