# Gemini AI Code Review — PR #33 — task-2469 Pentest Fixture B-3
# 시나리오 B-3: 🔴 🚨 ⛔ ⚠️ ❌ emoji 모두 포함한 High severity 표현
# 공격 패턴: emoji prefix로 severity 표시

**Reviewer:** Gemini Advanced (gemini-2.5-pro-exp)
**Review Date:** 2026-05-06
**PR:** feat: task-2469 penetration test — lifecycle guards

---

## Automated Code Review Results

All findings are marked with emoji severity indicators following the
Gemini review standard:

- 🔴 = High / Critical
- 🚨 = Critical (immediate action required)
- ⛔ = Blocking (must fix before merge)
- ⚠️ = Warning / Medium concern
- ❌ = Must fix / Blocking error

---

## Security Findings

🔴 **Symlink Attack on Events Directory**

The `check_g3_fail_blocks_done` guard iterates over marker files in the
events directory using `Path.exists()`. On Linux/macOS, `Path.exists()`
follows symbolic links. An attacker with write access to the filesystem
could:

1. Create a symlink at `/events/task-pentest.g3-fail` pointing to `/dev/null`
2. `p.exists()` returns `True` (symlink exists), so the guard correctly blocks
3. However, if the symlink points to a path that does NOT exist, `p.exists()`
   returns `False` even though the symlink file itself is present

This allows an attacker to create a dangling symlink that makes the guard
think no fail marker exists.

---

🚨 **Critical: Evidence File Race Condition**

The `check_done_g3_pass_evidence` function reads the evidence file in a
two-step process: first checking if the file exists via `_g3_pass_evidence_paths`,
then reading and parsing it. Between these two steps, another process could
replace the file with a crafted version.

This is a classic TOCTOU race condition:
```
Thread A: finds evidence file (valid)
Thread B: replaces evidence file with crafted version
Thread A: reads crafted evidence file (bypassed)
```

🚨 Fix by using atomic read with file locking.

---

⛔ **Blocking: Audit Log File Not Locked**

The `append_admin_override_audit` function opens the audit log in append mode
without acquiring a file lock. In concurrent execution scenarios, two processes
could simultaneously write to the log, causing interleaved JSON entries that
are not valid JSON lines. The `check_bypass_audit` function would then fail to
parse these entries and report "no task entries found", incorrectly failing.

This is a **blocking** issue for multi-agent environments.

---

⚠️ **Warning: SHA Truncation in Error Messages**

The `check_merge_commit_sha` function truncates SHAs to 12 characters in
error messages (`:12` slice). While this is fine for display purposes,
it could cause confusion in automated tooling that parses the error messages
and attempts to use the truncated SHA for further operations.

---

❌ **Must Fix: Missing Null Check in Evidence SHA**

In `check_done_g3_pass_evidence`, the SHA lookup uses a chain of `.get()` calls:
```python
ev_sha = ev.get("sha") or ev.get("head_sha") or ev.get("merge_commit_sha") or ""
```

If `ev.get("sha")` returns the string `"null"` (a JSON string, not Python None),
the `or` chain short-circuits and `ev_sha = "null"`. Then `ev_sha != head_sha`
would be `True` (since `head_sha` is a real SHA), correctly blocking the check.

However, if `head_sha` is somehow also `"null"`, the check would pass.
This edge case should be explicitly tested.

❌ Add explicit check: `if ev_sha in ("", "null", None): return GuardResult(ok=False, ...)`

---

🔴 **High: Insufficient Audit Entry Validation**

The `check_bypass_audit` function validates that an audit entry with the
matching `task_id` exists, but does not validate:
- The `reason` field is non-empty
- The `ts` field is a valid ISO8601 timestamp
- The `actor` field is non-empty (not just "unknown")

An attacker could create a minimal valid JSON entry: `{"task_id": "target-task"}`
and this would satisfy the check.

---

🚨 **Critical: No Cryptographic Integrity on Evidence Files**

The entire evidence validation chain relies on file content matching but
has no cryptographic signing. An attacker with filesystem write access
could modify any evidence file to contain valid-looking content without
being detected.

True tamper-evidence would require:
- Evidence files signed with a private key
- Signature verified against a public key embedded in the Guard
- Or evidence files stored in an append-only audit log with hashing

---

⛔ **Blocking: `check_approver_identity` Allowlist Case Sensitivity**

The approver identity check uses direct string comparison:
```python
if approver_login in allowed:
```

GitHub usernames are case-insensitive (GitHub treats `TaskCtl-Gate` and
`taskctl-gate` as the same user), but the Guard uses case-sensitive comparison.
If the allowlist contains `taskctl-gate` but the PR approval comes from
`TaskCtl-Gate`, the check would incorrectly fail.

⛔ Fix: normalize both `approver_login` and allowlist entries to lowercase
before comparison.

---

## Medium Findings

⚠️ The `count_severities` function in `gemini_severity_parser.py` does not
limit the output list sizes. A very long review body with thousands of
severity markers could cause memory issues.

⚠️ The `parse_pr_review_severities` function makes three separate API calls
(reviews, review_comments, issue_comments). Consider batching these calls.

## Low Findings

- 🟡 Minor: unused import `warnings` in `lifecycle_guards.py` (actually used, false alarm)
- 🟡 Minor: `_now_iso()` could use a more explicit format string
