Workflow Rescue: Is Your Broken Automation Actually Diagnosable?
A self-triage checklist for people whose workflows stopped working, started lying, or went silently wrong — and who want to know whether the problem is bounded enough to classify before touching anything live.
What this is (and what it isn't)
This is an educational self-assessment tool. It helps you figure out whether your automation problem can be understood from a description alone — or whether it requires live system access, credentials, or production data that no stranger (human or agent) should be touching without proper gates.
This is not a service, an intake form, a consultation offer, or a lead magnet. There is no one waiting on the other end to receive your answers. Use it to think clearly about your own problem before you decide what to do next.
When to use this
You built or inherited an automation workflow. It used to work. Now it doesn't — or worse, it appears to work but produces garbage, skips steps, or silently fails. You want to know:
- Can someone (or something) diagnose this from a description and public documentation alone?
- Or does this problem require hands-on access to production systems to even understand what's wrong?
If the answer is the second one, that's not a failure of diagnosis — it's a signal that the problem shape demands a different approach.
Step 1: Name your failure
Most broken automations fall into one of six categories. Find yours:
| Category | What it looks like | Classic symptom |
|---|---|---|
| Silent Trigger | Workflow should fire but doesn't. No error anywhere. | You find out from a customer complaint, not a log entry. |
| Swallowed Error | Failures are caught and replaced with generic messages or empty states. | System reports "success" but output is missing, blank, or corrupted. |
| Credential Propagation | A secret, token, or key stopped resolving in one place but works elsewhere. | "Credential not found" errors after an upgrade, rotation, or config change. |
| Browser Automation Crash | A headless browser hangs, runs out of memory, or enters a zombie state. | Task shows "running" for hours or days. No output. No error. |
| Governance Gap | An agent or automation can do things it shouldn't, with no approval step. | You discover the agent visited a page, submitted a form, or took an action nobody authorized. |
| Provider Mismatch | A feature is configured against a provider that doesn't support it. | Cryptic API errors that take hours to trace back to a capability gap. |
If your problem doesn't fit any of these, write down what it is before moving on. "I don't know" is a valid answer — it just means you need more investigation before classification is possible.
Step 2: Answer the 12 self-triage questions
Go through each question honestly. If you can't answer one, mark it "can't answer yet" — don't invent data to fill the gap.
| # | Question | What you're really checking |
|---|---|---|
| Q1 | What should this workflow produce when it works? | Can you state the intended outcome in one sentence? |
| Q2 | Which failure category does this fall into? | Can you name the problem type from Step 1? |
| Q3 | What specifically breaks, and what's the visible symptom? | Can you separate the symptom from the root cause? |
| Q4 | What tool categories are involved? | Name them by type (email, chat, CRM, automation runner, agent shell, provider dashboard) — never by account name or ID. |
| Q5 | Is this in a regulated domain? | Financial, medical, legal, or government systems need specialist review, not a generic diagnostic. |
| Q6 | Can you show a safe sample of the problem? | Fake field names, synthetic data, redacted examples — enough to illustrate the failure without exposing real information. |
| Q7 | What evidence do you have right now? | Error messages, public changelogs, open-source code, forum posts, release notes. Evidence that exists without touching production. |
| Q8 | Where does a human currently review the output? | Is there a human in the loop? Where? How often? |
| Q9 | What's the risk if this gets misdiagnosed? | Low (inconvenience), medium (missed sends), or high (financial, legal, customer-impacting)? |
| Q10 | What would proof of understanding look like? | A process map? A failure taxonomy? A risk register? Name the artifact. |
| Q11 | What access would someone need to investigate further? | If the answer is "none — a description plus public docs is enough," you're in good shape. |
| Q12 | What's explicitly out of scope? | What should nobody touch, change, or deploy as part of a diagnostic exercise? |
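One way to keep the "can't answer yet" rule honest is to record answers in a structure where an unanswered question stays visibly unanswered. The sketch below is only an illustration: the names `QUESTION_IDS`, `CANT_ANSWER_YET`, and `summarize` are hypothetical and not part of any tool.

```python
from typing import Dict, List, Optional

# Q1..Q12, matching the question table above.
QUESTION_IDS = [f"Q{n}" for n in range(1, 13)]

# Explicit marker for "can't answer yet"; never replace it with a guess.
CANT_ANSWER_YET = None


def summarize(answers: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Split the 12 questions into answered and "can't answer yet"."""
    answered, open_items = [], []
    for qid in QUESTION_IDS:
        value = answers.get(qid, CANT_ANSWER_YET)
        if value is not None and value.strip():
            answered.append(qid)
        else:
            # Missing, None, or blank answers all count as unanswered.
            open_items.append(qid)
    return {"answered": answered, "cant_answer_yet": open_items}


# Synthetic example answers, not real data.
answers = {
    "Q1": "Send a weekly digest email to subscribers",
    "Q2": "Silent Trigger",
    "Q7": CANT_ANSWER_YET,
}
summary = summarize(answers)
print(len(summary["answered"]), "answered;", len(summary["cant_answer_yet"]), "open")
```

The count of answered questions feeds directly into the scoring in Step 3.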
Step 3: Score yourself
Use this simple rubric to assess whether your problem is ready for structured classification:
FIT signals (problem is likely classifiable from description alone)
- You answered at least 10 of the 12 questions clearly
- You have at least one piece of non-production evidence (Q7)
- You can show a safe sample (Q6)
- No regulated domain (Q5 = no)
- The risk of misdiagnosis is low or medium (Q9)
- No additional access is needed beyond what you can describe (Q11 = none)
NEEDS MORE INVESTIGATION signals
- You answered only 7–9 questions clearly, and the rest say "can't answer yet"
- Evidence exists but is environment-dependent ("works on my machine")
- The problem is real and bounded, but you'd need to run it live to get diagnostic data
NOT READY signals (stop and address these first)
- You can't describe the problem without sharing credentials, tokens, or passwords
- Diagnosis requires reading live logs, inspecting running processes, or querying production databases
- The agent/automation acts without any human approval step
- The workflow involves financial transactions, customer communications, or irreversible operations with no documented gates
- You can't capture the problem shape without exposing customer records, personal data, or proprietary business logic
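The rubric above could be sketched as a small function. Two things here are assumptions, not something the checklist states: that all FIT signals must hold at once, and that any case that is neither FIT nor NOT READY falls into NEEDS MORE INVESTIGATION. The `SelfScore` fields and `classify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SelfScore:
    answered_clearly: int             # how many of the 12 questions have clear answers
    has_nonproduction_evidence: bool  # Q7
    has_safe_sample: bool             # Q6
    regulated_domain: bool            # Q5
    misdiagnosis_risk: str            # Q9: "low", "medium", or "high"
    needs_extra_access: bool          # Q11: False means "none"
    not_ready_signals: int = 0        # how many NOT READY signals apply


def classify(score: SelfScore) -> str:
    # Any NOT READY signal wins: stop and address it first.
    if score.not_ready_signals > 0:
        return "NOT READY"
    # Assumption: every FIT signal must hold at the same time.
    fit = (
        score.answered_clearly >= 10
        and score.has_nonproduction_evidence
        and score.has_safe_sample
        and not score.regulated_domain
        and score.misdiagnosis_risk in ("low", "medium")
        and not score.needs_extra_access
    )
    if fit:
        return "FIT"
    # Assumption: everything in between needs more investigation.
    return "NEEDS MORE INVESTIGATION"


print(classify(SelfScore(11, True, True, False, "low", False)))
print(classify(SelfScore(8, True, False, False, "medium", False)))
```

Treat the result as a prompt for thinking, not a verdict; the signals are qualitative and a function can't weigh them for you.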
Step 4: Check the red flags
If any of these apply, the problem is not ready for remote classification — regardless of how clearly you can describe it.
See the companion `red-flags-table.md` for the full breakdown. Here's the short version:
- Credential demand — You can't describe the problem without sharing tokens, passwords, API keys, session cookies, or private certificates.
- Production access requirement — Someone would need to read live logs, inspect running processes, or query production databases to understand what's happening.
- Unsupervised action expectation — You expect the diagnostician (human or agent) to send, spend, delete, modify accounts, or take client-impacting actions without human approval gates.
- High-impact automation without gates — The workflow involves financial transactions, customer communications, legal commitments, or irreversible operations without documented approval steps and rollback plans.
- Private data dependency — You cannot capture the problem shape without exposing customer records, personal data, internal communications, or proprietary business logic.
- Service commitment territory — The conversation turns to availability guarantees, response times, SLAs, ongoing support, or retainer arrangements. That's a different kind of conversation.
If zero red flags fire and your self-score is solid, you have a well-bounded problem that's ready for structured analysis. If one or more fire, that's not a failure — it's useful information about what kind of help you actually need.
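The override rule (any red flag beats a good self-score) can be illustrated with Python's standard `enum.Flag`. This is a minimal sketch; the `RedFlag` member names and `final_verdict` are hypothetical labels for the six flags listed above.

```python
from enum import Flag, auto


class RedFlag(Flag):
    NONE = 0
    CREDENTIAL_DEMAND = auto()
    PRODUCTION_ACCESS = auto()
    UNSUPERVISED_ACTION = auto()
    HIGH_IMPACT_WITHOUT_GATES = auto()
    PRIVATE_DATA_DEPENDENCY = auto()
    SERVICE_COMMITMENT = auto()


def final_verdict(self_score: str, fired: RedFlag) -> str:
    """Red flags override the self-score, however clear the description is."""
    if fired != RedFlag.NONE:
        return "NOT READY"
    return self_score


fired = RedFlag.CREDENTIAL_DEMAND | RedFlag.PRODUCTION_ACCESS
print(final_verdict("FIT", fired))
print(final_verdict("FIT", RedFlag.NONE))
```

Flags combine with `|`, so one value can carry every flag that fired.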
Step 5: Prepare your proof artifacts
If you're going to ask someone (or something) to help you think through this, prepare these first. They make the problem legible without exposing anything sensitive:
| Artifact | What it shows | Format |
|---|---|---|
| Problem description | What should happen, what breaks, what the visible symptom is | One page of plain text |
| Tool stack map | Categories of tools involved (never account names or IDs) | Simple list or table |
| Safe sample | A synthetic/redacted example illustrating the failure | Markdown with fake data |
| Evidence inventory | What you already know from public sources, redacted logs, or documentation | Annotated list with links |
| Process map | Where in the workflow the failure occurs | ASCII diagram, Mermaid, or hand-drawn photo |
| Blocker list | What must exist before any real implementation could start | Bullet list |
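For the safe sample, one possible approach is to keep only the shape of a record: how many fields it has, what type each value is, and which values were empty. The sketch below assumes a flat dictionary and uses a hypothetical helper, `make_safe_sample`. It does not handle nested data, and a human should still review the result before sharing it.

```python
from typing import Any, Dict


def make_safe_sample(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with fake field names and placeholder values.

    Empty values stay empty because they are often the symptom
    the sample is meant to illustrate.
    """
    safe: Dict[str, Any] = {}
    for index, value in enumerate(record.values(), start=1):
        key = f"field_{index}"  # fake field name; the real one is dropped
        if value is None or value == "":
            safe[key] = value
        elif isinstance(value, bool):  # check bool before int: bool is an int subclass
            safe[key] = False
        elif isinstance(value, (int, float)):
            safe[key] = type(value)(0)  # keep the numeric type, drop the number
        else:
            safe[key] = f"example_value_{index}"
    return safe


# Hypothetical record standing in for real data.
record = {
    "customer_email": "real.person@corp.example",
    "order_total": 129.5,
    "tracking_code": "",
}
sample = make_safe_sample(record)
print(sample)
```

Here the blank third field survives, which is the failure being illustrated, while the email address and the amount do not.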
Step 6: Know what NOT to send
Before you share your problem description with anyone — a consultant, a community forum, an AI agent, a colleague — check that you haven't included:
- Real credentials — tokens, passwords, API keys, session cookies, certificates
- Production logs — live error traces with real request IDs, user data, or IP addresses
- Customer data — names, emails, phone numbers, order details, support tickets
- Account identifiers — specific account names, organization IDs, billing information
- Internal URLs — admin dashboards, staging environments, private API endpoints
- Proprietary logic — business rules, pricing algorithms, competitive intelligence that you wouldn't want public
If you can't describe the problem without these, the problem isn't ready for external classification. That's not a blocker — it's a boundary.
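A rough pre-share check can catch some of the items in this list before a description leaves your hands. The patterns below are heuristic assumptions chosen for illustration. They will miss things, they can't recognize proprietary logic at all, and a clean result is not proof that a draft is safe.

```python
import re
from typing import Dict, List

# Heuristic patterns only; not a complete detector.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPv4 address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "URL": re.compile(r"https?://\S+"),
    "bearer token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE),
    "key/secret assignment": re.compile(
        r"\b(?:api[_-]?key|token|secret|password)\b\s*[:=]\s*\S+", re.IGNORECASE
    ),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_sensitive(text: str) -> List[str]:
    """Return the names of the patterns that match anywhere in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


# Synthetic draft text, not a real credential or endpoint.
draft = "The sync step fails with api_key=abc123 when calling https://admin.internal.example/run"
print(find_sensitive(draft))
```

Anything the check reports should be replaced with a synthetic stand-in rather than deleted, so the problem shape stays readable.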
What "fit" actually means
A problem that's "fit" for structured classification is one where:
- The failure can be named and categorized from description alone
- Evidence exists in public or synthetic form
- No production access is needed to understand the problem shape
- The output is a diagnostic document, not a deployed fix
- The risk of working on it is contained
A problem that's "not fit" isn't broken or unimportant — it just needs a different format. Maybe it needs hands-on access. Maybe it needs a specialist. Maybe it needs governance work before anyone should touch it. Knowing which one you have saves everyone time.
What to do with your results
- FIT: Your problem is well-bounded. Write it up using the proof artifacts above. You now have a clear description you can use to get useful input from communities, documentation, or diagnostic tools.
- NEEDS MORE INVESTIGATION: Go gather the missing evidence. Run the workflow once more with logging enabled. Capture the error. Note the environment. Then come back to Step 2.
- NOT READY: Address the red flags first. If the problem requires credentials, set up a sandbox. If it requires production access, find someone with authorized access. If it involves unsupervised agent action, add approval gates before doing anything else.
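For the NEEDS MORE INVESTIGATION path, "run it once more with logging enabled" might look like the sketch below. It uses only Python's standard `logging`, `platform`, and `sys` modules. The wrapper `run_once_with_logging` and the step passed to it are hypothetical stand-ins for your own workflow step, ideally run in a sandbox rather than against production.

```python
import logging
import platform
import sys
from typing import Any, Callable

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("workflow_rerun")


def run_once_with_logging(step: Callable[[], Any]) -> Any:
    """Run one workflow step, logging the environment and any exception."""
    # Note the environment first, so the evidence is not "works on my machine".
    log.info("python=%s platform=%s", sys.version.split()[0], platform.platform())
    try:
        result = step()
    except Exception:
        # log.exception records the full traceback instead of swallowing it.
        log.exception("step failed")
        raise
    # Log only the result's type and emptiness, not its contents.
    log.info("step returned type=%s empty=%s", type(result).__name__, not result)
    return result


def hypothetical_step() -> list:
    # Stand-in for a step that "succeeds" but produces nothing.
    return []


output = run_once_with_logging(hypothetical_step)
```

The resulting log is itself evidence for Q7, but review it against Step 6 before sharing it with anyone.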
This checklist is educational material about automation workflow diagnostics. It does not constitute a service offer, consultation, or commitment to fix anything. Use it to think clearly about your own problems before deciding what kind of help you need.