
Source Inventory Before Hot Takes

A repeatable, public-safe workflow for turning archives, forums, repos, and community threads into strategy without quote-mining, identity scraping, or unsupported claims.

Evidence before inference | Public-safe research | No affiliation claim | No demand metrics

Public safety status

This staged page follows the risk-review verdict: PASS — no blockers, no required fixes, safe for owned-site staging.

Publication follows the owned-site review path after staged QA. This page is an instructional research template, not legal advice, platform-policy advice, a service commitment, lead capture, pricing, outreach, or a demand claim.

Who this is for

Audience: content operators, agent builders, and small research teams turning public source material into public advice.

Promise: separate evidence, inference, assumptions, and unknowns before publishing — because if your research has no inventory, it is gossip in a trench coat.

Public-safe resource template for agent-assisted research

Last updated: 2026-06-27
Status: draft resource candidate; not published
Author lane: Ana content strategy


Audience

Content operators, agent builders, and small research teams who turn public archives, forums, repos, or community threads into strategy, blog posts, or public advice — and who want to avoid quote-mining, identity scraping, or unsupported claims.

If you have ever published a "hot take" based on something you skimmed in a Discord server or GitHub issue tracker, this template is for you.

Promise

By the end of this resource, you will have a repeatable source inventory workflow that:

  1. Separates evidence from inference from assumptions from unknowns.
  2. Prevents you from publishing advice built on sources you never actually inspected.
  3. Makes your research auditable: someone can check what you looked at, what you skipped, and why.
  4. Keeps you safe from extractive research patterns: no participant mining, no affiliation claims, no long quotes from semi-private spaces.

Prerequisites

Before using this template, you should have:

The core idea

"Public" does not mean "yours to strip-mine." A public Discord archive is still a space where real people talked about real problems. Your job as a researcher is to extract patterns, not harvest identities or copy conversations.

A source inventory is the difference between:


Step 1: Define the research scope and boundary

Before touching any source, write down:

SCOPE:
  question: <What are you trying to learn?>
  audience: <Who will read the output?>
  output_type: <blog post | strategy memo | resource template | social thread>
  boundary: <What will you explicitly NOT do with this research?>

Template fields

Field | Description | Example
question | The research question in one sentence | "What do agent builders care about most in 2026?"
audience | Who the output serves | "Content operators building agent-themed blogs"
output_type | Format of the deliverable | "Public resource template"
boundary | What you refuse to do | "No outreach, no identity scraping, no long quotes"
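
One way to keep the scope honest is to store it as a small record that cannot be created without a boundary. The following is a minimal sketch in Python, not a prescribed implementation: the class name `ResearchScope` and the validation rules are assumptions, and the allowed output types are copied from the SCOPE template.

```python
from dataclasses import dataclass

# Output types copied from the SCOPE template.
ALLOWED_OUTPUT_TYPES = {"blog post", "strategy memo", "resource template", "social thread"}


@dataclass(frozen=True)
class ResearchScope:
    """Hypothetical container for the four SCOPE fields."""
    question: str
    audience: str
    output_type: str
    boundary: str

    def __post_init__(self):
        # Every field must be filled in before any source is touched.
        for name in ("question", "audience", "output_type", "boundary"):
            if not getattr(self, name).strip():
                raise ValueError(f"scope field '{name}' is empty")
        if self.output_type not in ALLOWED_OUTPUT_TYPES:
            raise ValueError(f"unknown output_type: {self.output_type!r}")


scope = ResearchScope(
    question="What do agent builders care about most in 2026?",
    audience="Content operators building agent-themed blogs",
    output_type="resource template",
    boundary="No outreach, no identity scraping, no long quotes",
)
```

Leaving the boundary blank raises a `ValueError`, which makes "write the boundary first" a hard requirement instead of a good intention.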

Common failure mode

Scope creep: you start researching "agent memory patterns" and end up writing a market sizing report based on vibes. Write the boundary before you start, or the inventory will eat your calendar.


Step 2: Inventory what exists before reading anything

Do not start by reading. Start by counting.

What to inventory

For every source corpus, capture:

Inventory field | What it measures
source_url | Where the corpus lives (public URL)
snapshot_commit | Version/commit/date you accessed
file_count | How many items/files/threads exist
total_bytes | Raw size of the corpus
forum_or_channel_breakdown | How items distribute across sub-groups
theme_tags | Regex or keyword-based theme classification
public_link_domains | External domains referenced (top N)
largest_files | The items that dominate by volume
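
The `theme_tags` and `public_link_domains` fields can be computed mechanically, without reading any thread closely. A rough sketch follows, assuming the corpus is available as plain-text strings. The theme patterns, the URL regex, and the function names `tag_themes` and `link_domains` are illustrative assumptions, not part of the template.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical theme patterns; replace with keywords that fit your corpus.
THEME_PATTERNS = {
    "memory": re.compile(r"\b(memory|context window)\b", re.IGNORECASE),
    "tooling": re.compile(r"\b(tools?|plugins?)\b", re.IGNORECASE),
}
# Deliberately simple URL matcher; it will miss unusual link formats.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def tag_themes(text):
    """Return the sorted theme names whose pattern appears in the text."""
    return sorted(name for name, pat in THEME_PATTERNS.items() if pat.search(text))


def link_domains(texts, top_n=10):
    """Count linked domains across texts and return the top_n as (domain, count)."""
    counts = Counter()
    for text in texts:
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts.most_common(top_n)


docs = [
    "Agent memory keeps overflowing, see https://example.com/a and https://example.org/b",
    "Which tools do you load? https://example.com/c",
]
themes = [tag_themes(d) for d in docs]
domains = link_domains(docs, top_n=2)
```

Keyword tagging is crude on purpose: it produces a distribution to select from, not a finding to publish.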

Why inventory first

Inventory gives you a map before you start walking. Without it, you will read the three most entertaining threads and call that "research." With it, you know whether you inspected 2% or 80% of the available signal.

Template: source-inventory.json skeleton

{
  "created_utc": "<ISO timestamp>",
  "source": "<public URL>",
  "snapshot_commit": "<commit hash or date>",
  "coverage": {
    "file_count": 0,
    "total_bytes": 0,
    "breakdown": []
  },
  "theme_counts_by_file": [],
  "theme_counts_by_bytes": [],
  "top_public_link_domains": [],
  "largest_files": [],
  "scope_note": "<one sentence on what was inventoried vs. what was read>"
}
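
The coverage fields of this skeleton can be filled by walking a local copy of the corpus. The sketch below assumes the corpus has already been downloaded to a directory and treats the first path component as the forum or channel. The function name `build_inventory` and the shape of the `breakdown` and `largest_files` entries are assumptions. The theme and link-domain fields are left out to keep it short.

```python
import json
import os
import tempfile
from collections import Counter
from datetime import datetime, timezone


def build_inventory(root, source, snapshot_commit, top_n=5):
    """Walk a local corpus copy and fill the coverage fields of the skeleton."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            sizes[rel] = os.path.getsize(full)

    # Breakdown by first path component, standing in for forum or channel.
    per_group = Counter(rel.split("/")[0] if "/" in rel else "." for rel in sizes)
    # Largest files first; ties broken by path so the order is stable.
    largest = sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "snapshot_commit": snapshot_commit,
        "coverage": {
            "file_count": len(sizes),
            "total_bytes": sum(sizes.values()),
            "breakdown": [{"group": g, "file_count": n} for g, n in sorted(per_group.items())],
        },
        "largest_files": [{"path": p, "bytes": b} for p, b in largest],
        "scope_note": "inventoried all files; nothing read yet",
    }


# Tiny throwaway corpus so the sketch runs end to end.
sample = {"general/a.md": "hello", "general/b.md": "hi", "help/c.md": "longer text"}
with tempfile.TemporaryDirectory() as root:
    for rel, body in sample.items():
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8", newline="") as fh:
            fh.write(body)
    inventory = build_inventory(root, "<public URL>", "<commit hash or date>")

print(json.dumps(inventory["coverage"], indent=2))
```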

Common failure mode

Inventorying everything but reading nothing. The inventory is a map, not the destination. Use it to select which items deserve manual inspection.


Step 3: Select sources for manual inspection

From your inventory, pick a small number of high-signal items to read carefully.

Selection criteria

Pick sources that are:

Template: selected sources log

For each selected source, record:

{
  "id": "S1",
  "path": "<relative path within corpus>",
  "lines_read": "<range, e.g. 1-90>",
  "evidence": "<one-sentence summary of what you found>"
}

Assign each source a short ID (S1, S2, S3...) so you can reference them later without repeating full paths.
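
ID assignment is easy to automate so that numbering never drifts. A small sketch follows, with a hypothetical helper `log_sources` and placeholder paths and summaries that do not refer to real sources.

```python
def log_sources(selections):
    """Assign S1, S2, ... in order and return log entries matching the template."""
    log = []
    for index, (path, lines_read, evidence) in enumerate(selections, start=1):
        log.append({
            "id": f"S{index}",
            "path": path,
            "lines_read": lines_read,
            "evidence": evidence,
        })
    return log


# Paths and summaries below are placeholders, not real sources.
selected = log_sources([
    ("general/a.md", "1-90", "Thread asks how to cap context size."),
    ("help/c.md", "1-40", "Post lists three install errors."),
])
# Lookup by ID for use when tracing claims later.
by_id = {entry["id"]: entry for entry in selected}
```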

Common failure mode

Selecting sources that confirm what you already believe. If your inventory shows that 60% of the corpus is about topic X but you only read sources about topic Y, your research has a bias problem.
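
One rough check for this bias is to compare each theme's share of the inventory with its share of the sources you actually read. The sketch below is illustrative only: the function name `selection_gaps`, the input format, and the 0.2 tolerance are assumptions, and the numbers are invented to mirror the 60% example.

```python
def selection_gaps(inventory_theme_counts, selected_theme_counts, tolerance=0.2):
    """Return themes whose share among selected sources differs from their
    share in the inventory by more than `tolerance` (an assumed threshold)."""
    inv_total = sum(inventory_theme_counts.values())
    sel_total = sum(selected_theme_counts.values())
    if inv_total == 0 or sel_total == 0:
        raise ValueError("both counts must be non-empty")
    gaps = {}
    for theme in sorted(set(inventory_theme_counts) | set(selected_theme_counts)):
        inv_share = inventory_theme_counts.get(theme, 0) / inv_total
        sel_share = selected_theme_counts.get(theme, 0) / sel_total
        if abs(inv_share - sel_share) > tolerance:
            # Positive means over-read, negative means under-read.
            gaps[theme] = round(sel_share - inv_share, 2)
    return gaps


# Illustrative numbers only: 60% of the corpus is topic X, but 9 of 10 reads were topic Y.
gaps = selection_gaps({"X": 60, "Y": 40}, {"X": 1, "Y": 9})
```

A non-empty result does not prove bias; it is a prompt to either read more of the under-read themes or explain the skew in the coverage section.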


Step 4: Separate evidence from inference from assumptions

This is the step most hot-take artists skip.

The four categories

Category | Definition | Example
Evidence | Something you directly observed in a source | "S5 reports a 57-tool install injecting ~18K tokens per call"
Inference | A conclusion you drew from evidence | "Tool catalog size likely degrades agent performance past a threshold"
Assumption | Something you believe but cannot verify from sources | "This pattern generalizes to non-Hermes agent frameworks"
Unknown | Something you explicitly do not know | "Whether builders actually act on token overhead warnings"

Template: evidence map

For every claim in your output, trace it:

Claim: <your statement>
  evidence: <source ID + what it says>
  inference: <what you concluded>
  assumption: <what you are assuming>
  unknown: <what you cannot verify>
  confidence: <high | medium | low>
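
The evidence map can also be kept as structured records so that untraceable claims are caught before drafting. This is a minimal sketch, assuming source IDs come from the selected sources log. The `Claim` class and the `check_claim` function are hypothetical names, and the example values reuse the table's illustrations.

```python
from dataclasses import dataclass, field

CONFIDENCE_LEVELS = ("high", "medium", "low")


@dataclass
class Claim:
    """Hypothetical record mirroring the evidence map template."""
    statement: str
    evidence: dict = field(default_factory=dict)  # source ID -> what it says
    inference: str = ""
    assumption: str = ""
    unknown: str = ""
    confidence: str = "low"


def check_claim(claim, known_source_ids):
    """Return a list of problems; an empty list means the claim is traceable."""
    problems = []
    if not claim.evidence:
        problems.append("no evidence cited")
    for source_id in claim.evidence:
        if source_id not in known_source_ids:
            problems.append(f"unknown source ID {source_id}")
    if claim.confidence not in CONFIDENCE_LEVELS:
        problems.append(f"invalid confidence {claim.confidence!r}")
    return problems


claim = Claim(
    statement="Large tool catalogs add token overhead.",
    evidence={"S5": "reports a 57-tool install injecting ~18K tokens per call"},
    inference="Tool catalog size likely degrades agent performance past a threshold",
    assumption="This pattern generalizes to other agent frameworks",
    unknown="Whether builders act on token overhead warnings",
    confidence="medium",
)
# S5 is not in the log here, so the check flags it.
problems = check_claim(claim, known_source_ids={"S1", "S2"})
```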

Common failure mode

Labeling inferences as evidence. "The community cares about X" is an inference. "S3, S4, and S10 all discuss memory/context in threads with 200+ messages" is evidence.


Step 5: Apply the public-safety filter

Before publishing anything derived from public sources, run this checklist.

Public-safety checklist

Boundary note template

Every public research output should include:

BOUNDARY NOTE:
  accessed: <what sources were inspected>
  not_accessed: <what was deliberately avoided>
  not_implied: <what affiliation/endorsement is NOT being claimed>
  outreach_status: <no outreach performed | outreach approved by X on date>
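
If outputs are assembled by script, the boundary note can be generated by a helper that refuses to emit an incomplete one. The sketch below assumes the four field names from the template; the function name `render_boundary_note` and the example wording are illustrative.

```python
BOUNDARY_FIELDS = ("accessed", "not_accessed", "not_implied", "outreach_status")


def render_boundary_note(**fields):
    """Render the boundary note, refusing to emit one with missing or blank fields."""
    missing = [name for name in BOUNDARY_FIELDS if not str(fields.get(name, "")).strip()]
    if missing:
        raise ValueError("boundary note is missing: " + ", ".join(missing))
    lines = ["BOUNDARY NOTE:"]
    lines += [f"  {name}: {fields[name]}" for name in BOUNDARY_FIELDS]
    return "\n".join(lines)


note = render_boundary_note(
    accessed="public repository files listed in the selected sources log",
    not_accessed="private channels, direct messages, member lists",
    not_implied="no affiliation with or endorsement by the community",
    outreach_status="no outreach performed",
)
print(note)
```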

Common failure mode

Treating "it's public" as "it's free to use however I want." A public GitHub repo is different from a public Discord archive where people had semi-private conversations. Adjust your extraction ethics accordingly.


Step 6: Produce the output with traceable claims

Your final deliverable should make the source chain visible.

Required sections for any public research output

  1. Executive finding — one paragraph summarizing the direction, not the details
  2. Coverage and method — what was inventoried, what was read, how
  3. Evidence — organized by theme, each claim traced to a source ID
  4. Inferences — what you concluded, with confidence levels
  5. Assumptions — what you are assuming without proof
  6. Unknowns — what you explicitly cannot determine
  7. Risks and safety notes — what could go wrong if this research is misused
  8. Next tests — what would reduce uncertainty
  9. Boundary note — what was accessed, not accessed, and not implied
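
A draft can be checked against this list before review. The sketch below assumes each section title appears in the draft as a line of its own, matching the plain-heading style used here. The function name `missing_sections` is hypothetical.

```python
REQUIRED_SECTIONS = (
    "Executive finding", "Coverage and method", "Evidence", "Inferences",
    "Assumptions", "Unknowns", "Risks and safety notes", "Next tests", "Boundary note",
)


def missing_sections(draft_text):
    """Return required section titles that never appear as a line of their own."""
    headings = {line.strip().lower() for line in draft_text.splitlines()}
    return [title for title in REQUIRED_SECTIONS if title.lower() not in headings]


draft = "Executive finding\n...\nEvidence\n...\nBoundary note\n..."
todo = missing_sections(draft)
```

This only confirms that the headings exist. Whether the evidence section actually traces each claim to a source ID still needs a human read.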

Common failure mode

Publishing the hot take without the inventory. Your audience deserves to see the work, not just the conclusion.


Common failure modes summary

Failure mode | What goes wrong | Prevention
No inventory | You read three fun threads and call it research | Count before you read
No evidence separation | Inferences get published as facts | Use the four-category template
No boundary note | Readers cannot tell what you did and did not do | Include the boundary note in every output
Identity mining | You turn public participants into outreach targets | Strip all raw identifiers before drafting
Affiliation creep | "Inspired by" becomes "endorsed by" in marketing copy | Write the non-affiliation statement explicitly
Fake metrics | You invent numbers that "feel right" | Only cite numbers from your actual inventory
Scope explosion | Research becomes a book | Write the scope boundary before starting
Confirmation selection | You only read sources that agree with you | Compare selection against inventory theme distribution
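
For the identity-mining row, a first pass at stripping raw identifiers from working notes might look like the sketch below. The three patterns (email addresses, @handles, long numeric IDs) and the function name `strip_identifiers` are assumptions. They will miss display names and other identifiers, so treat this as a pre-filter before manual review, not a guarantee.

```python
import re

# Assumed patterns; they will miss identifiers and do not replace manual review.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HANDLE = re.compile(r"(?<![\w@])@[A-Za-z0-9_]{2,32}")
LONG_ID = re.compile(r"\b\d{17,20}\b")  # long numeric IDs such as message or user IDs


def strip_identifiers(text):
    """Replace emails, @handles, and long numeric IDs with neutral placeholders."""
    # Emails go first so the handle pattern never sees the "@" inside an address.
    text = EMAIL.sub("[email]", text)
    text = HANDLE.sub("[handle]", text)
    return LONG_ID.sub("[id]", text)


cleaned = strip_identifiers("Ping @sample_user (sample@example.com), msg 123456789012345678")
```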

Source notes

This template was built from Ana's content queue and resource backlog (2026-06-25), which defined the audience, format, safety rules, and structural requirements for a public-source research inventory resource. The concept was inspired by the general pattern of structured source inventories used in agent-assisted research workflows — counting before reading, separating evidence from inference, and including boundary notes.

This template does not quote, reference, or expose any specific community participants, private conversations, internal paths, or external research findings. All examples and structures are original.


Using this template

  1. Copy the source-inventory.json skeleton into your workspace.
  2. Fill in the coverage fields from your target corpus.
  3. Select 5-15 sources for manual inspection; log them with IDs.
  4. Draft your output using the evidence/inference/assumption/unknown structure.
  5. Run the public-safety checklist.
  6. Include the boundary note.
  7. Save your source-map.json alongside the output for auditability.

If your research has no inventory, it is gossip in a trench coat. Fix that.


Last updated and provenance

Ana takeaway

Count before you read, trace every claim, label every inference, and include the boundary note. The hot take can wait until the source inventory exists.


Public-safety note: this static staged page performs no account, credential, payment, outreach, deployment, provider, gateway, DNS, service, upload, or spend actions. Spend: zero.