# Systematic Debugging — Detailed Examples and Techniques

> See `SKILL.md` for the core principles

---

## Phase 1: Root Cause Investigation — Detailed Steps

### 1. Read Error Messages Carefully
- Don't skip past errors or warnings — they often contain the exact solution
- Read stack traces completely
- Note line numbers, file paths, error codes
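A minimal sketch of capturing the full output instead of letting it scroll past; the `build` function here is a hypothetical stand-in for your real failing command:

```shell
# Hypothetical stand-in for the real failing step; replace with your command.
build() { echo "src/sign.sh:42: error: IDENTITY is unset (exit code 3)" >&2; return 3; }

log=$(mktemp)
build 2>&1 | tee "$log"          # capture everything so nothing scrolls past
grep -nE "error|warning" "$log"  # extract file paths, line numbers, error codes
```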

### 2. Reproduce Consistently
- Can you trigger it reliably?
- What are the exact steps?
- If not reproducible → gather more data, don't guess
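A reliability check can be sketched as a counting loop; `run_tests` is a hypothetical stand-in for the real repro command (here it fails deterministically):

```shell
# Hypothetical stand-in for the failing command; replace with your real repro.
run_tests() { return 1; }

fails=0
for i in 1 2 3 4 5 6 7 8 9 10; do
  run_tests >/dev/null 2>&1 || fails=$((fails + 1))
done
echo "failed $fails/10 runs"
# 10/10 = deterministic repro; anything in between = flaky (gather more data, don't guess)
```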

### 3. Check Recent Changes
- What changed that could cause this?
- Git diff, recent commits
- New dependencies, config changes, environmental differences
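The steps above can be sketched with git. This demo builds a throwaway repo so the commands run as-is; in a real investigation, run them in your own checkout:

```shell
# Throwaway demo repo; in practice, run these commands in your project.
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email debug@example.com && git config user.name debug
echo 'retries=3' > app.conf && git add . && git commit -qm "baseline: known good"
echo 'retries=0' > app.conf && git commit -qam "tuning: adjust retries"

git log --oneline -15        # what landed recently that could cause this?
git diff HEAD~1 --stat       # which files changed?
git diff HEAD~1 -- app.conf  # the exact change in a suspect file
# Unknown breaking commit? Bisect between good and bad:
#   git bisect start; git bisect bad HEAD; git bisect good <known-good-sha>
```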

### 4. Gather Evidence in Multi-Component Systems

**WHEN the system has multiple components (CI → build → signing, API → service → database):**

Add diagnostic instrumentation at each boundary BEFORE proposing fixes:

```bash
# Layer 1: Workflow
echo "=== Secrets available in workflow: ==="
# Report presence only; never echo the secret value itself
echo "IDENTITY: $([ -n "$IDENTITY" ] && echo SET || echo UNSET)"

# Layer 2: Build script
echo "=== Env vars in build script: ==="
env | grep IDENTITY || echo "IDENTITY not in environment"

# Layer 3: Signing script
echo "=== Keychain state: ==="
security list-keychains
security find-identity -v

# Layer 4: Actual signing
codesign --sign "$IDENTITY" --verbose=4 "$APP"
```

This reveals which layer fails (e.g., secrets → workflow ✓, workflow → build ✗).

### 5. Trace Data Flow

When the error surfaces deep in a call stack:
- Where does the bad value originate?
- What called this with the bad value?
- Keep tracing up until you find the source
- Fix at source, not at symptom

See `root-cause-tracing.md` in this directory for the complete backward tracing technique.

---

## Phase 2: Pattern Analysis — Detailed Steps

1. **Find Working Examples** — Locate similar working code in the same codebase
2. **Compare Against References** — Read reference implementation COMPLETELY; don't skim
3. **Identify Differences** — List every difference, however small; don't assume "that can't matter"
4. **Understand Dependencies** — What settings, config, environment does this need? What assumptions does it make?
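The comparison step can be as simple as diffing a working configuration against the broken one. The inline snippets below are stand-ins; in practice, point `diff` at the real files:

```shell
# Inline stand-ins for a working vs. broken config; use your real files.
working=$(mktemp); broken=$(mktemp)
printf 'retries=3\ntimeout=30\nsign=true\n'  > "$working"
printf 'retries=3\ntimeout=30\nsign=false\n' > "$broken"

# List EVERY difference, however small; don't assume "that can't matter".
diff "$working" "$broken" || true
```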

---

## Phase 3: Hypothesis and Testing — Detailed Steps

1. **Form Single Hypothesis** — State clearly: "I think X is the root cause because Y"; write it down
2. **Test Minimally** — Make the SMALLEST possible change to test hypothesis; one variable at a time
3. **Verify Before Continuing** — Worked? → Phase 4. Didn't work? → Form NEW hypothesis; DON'T add more fixes on top
4. **When You Don't Know** — Say "I don't understand X"; ask for help; research more
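One-variable-at-a-time testing, sketched with a hypothetical `failing_step` whose behavior depends on a single environment variable (the variable name and function are assumptions for illustration):

```shell
# Hypothetical failing step: suppose we suspect STRICT_MODE is the trigger.
failing_step() { [ "${STRICT_MODE:-0}" = "1" ] && return 1; return 0; }

# Control run: everything at baseline.
STRICT_MODE=0 failing_step && echo "baseline: pass"
# Hypothesis run: change ONE variable, nothing else.
STRICT_MODE=1 failing_step || echo "confirmed: STRICT_MODE=1 triggers the failure"
```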

---

## Phase 4: Implementation — Detailed Steps

1. **Create Failing Test Case** — Simplest possible reproduction; automated test if possible; use `superpowers:test-driven-development` skill
2. **Implement Single Fix** — Address the root cause; ONE change at a time; no "while I'm here" improvements
3. **Verify Fix** — Test passes? No other tests broken? Issue actually resolved?
4. **If Fix Doesn't Work** — STOP. Count fixes attempted. If < 3: return to Phase 1. If ≥ 3: question the architecture.
5. **If 3+ Fixes Failed: Question Architecture**

   Pattern indicating architectural problem:
   - Each fix reveals new shared state/coupling/problem in a different place
   - Fixes require "massive refactoring" to implement
   - Each fix creates new symptoms elsewhere

   STOP and question fundamentals:
   - Is this pattern fundamentally sound?
   - Are we "sticking with it through sheer inertia"?
   - Should we refactor architecture vs. continue fixing symptoms?

   **Discuss with your human partner before attempting more fixes.**
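Step 1 in miniature: write the smallest test that fails for the right reason before touching the fix. `parse_version` is a hypothetical function containing the bug under investigation:

```shell
# Hypothetical buggy function: drops the minor/patch parts of a version string.
parse_version() { echo "$1" | cut -d. -f1; }

# Smallest failing test case: documents the expected behavior BEFORE the fix.
got=$(parse_version "1.2.3")
[ "$got" = "1.2.3" ] && echo "PASS" || echo "FAIL: got '$got', want '1.2.3'"
```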

---

## Your Human Partner's Signals You're Doing It Wrong

| Signal | Meaning |
|--------|---------|
| "Is that not happening?" | You assumed without verifying |
| "Will it show us...?" | You should have added evidence gathering |
| "Stop guessing" | You're proposing fixes without understanding |
| "Ultrathink this" | Question fundamentals, not just symptoms |
| "We're stuck?" (frustrated) | Your approach isn't working |

**When you see these:** STOP. Return to Phase 1.

---

## Quick Reference

| Phase | Key Activities | Success Criteria |
|-------|---------------|------------------|
| **1. Root Cause** | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY |
| **2. Pattern** | Find working examples, compare | Identify differences |
| **3. Hypothesis** | Form theory, test minimally | Confirmed or new hypothesis |
| **4. Implementation** | Create test, fix, verify | Bug resolved, tests pass |

---

## When to Use

Use for ANY technical issue:
- Test failures
- Bugs in production
- Unexpected behavior
- Performance problems
- Build failures
- Integration issues

**Use ESPECIALLY when:**
- Under time pressure (emergencies make guessing tempting)
- "Just one quick fix" seems obvious
- You've already tried multiple fixes
- You don't fully understand the issue

---

## When Process Reveals "No Root Cause"

If systematic investigation reveals the issue is truly environmental, timing-dependent, or external:

1. You've completed the process
2. Document what you investigated
3. Implement appropriate handling (retry, timeout, error message)
4. Add monitoring/logging for future investigation

**But:** 95% of "no root cause" cases are incomplete investigation.
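When the cause really is external (a flaky dependency, say), handle it explicitly and log evidence for later. A sketch with a stubbed `fetch_artifact` that succeeds on the third attempt; in real code the stub is your actual external call:

```shell
# Stub standing in for a flaky external call; here it succeeds on attempt 3.
attempts=0
fetch_artifact() { attempts=$((attempts + 1)); [ "$attempts" -ge 3 ]; }

for try in 1 2 3; do
  if fetch_artifact; then echo "succeeded on attempt $try"; break; fi
  # Log enough to investigate later; real code would add timestamps/context.
  echo "attempt $try failed" >&2
  sleep 0   # real code: sleep $((try * 2)) for backoff
done
```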

---

## Supporting Techniques

- **`root-cause-tracing.md`** — Trace bugs backward through call stack to find original trigger
- **`defense-in-depth.md`** — Add validation at multiple layers after finding root cause
- **`condition-based-waiting.md`** — Replace arbitrary timeouts with condition polling

**Related skills:**
- **superpowers:test-driven-development** — For creating failing test case (Phase 4, Step 1)
- **superpowers:verification-before-completion** — Verify fix worked before claiming success

---

## Real-World Impact

From debugging sessions:
- Systematic approach: 15–30 minutes to fix
- Random fixes approach: 2–3 hours of thrashing
- First-time fix rate: 95% vs 40%
- New bugs introduced: Near zero vs common
