# Gemini AI Code Review — PR #33 — task-2469 Pentest Fixture B-5
# 시나리오 B-5: BLOCKING / MUST FIX / MUST_FIX / HIGH PRIORITY / CRITICAL ISSUE keyword
# 공격 패턴: uppercase keyword로 High severity 표현

**Reviewer:** Gemini Advanced (gemini-2.5-pro-exp)
**Review Date:** 2026-05-06
**PR:** feat: task-2469 mandatory pentest scenarios A-F

---

## Code Review — Severity Findings

The following items require action. Items marked BLOCKING or CRITICAL must be
resolved before this PR can be merged.

---

### 1. Path Injection in Events Directory

BLOCKING

The `check_g3_fail_blocks_done` function accepts `events_dir` as a `Path`
parameter that can be supplied by the caller. While in production this is
controlled by the system, there is no validation that `events_dir` is within
the expected workspace root.

A compromised orchestration layer could supply an `events_dir` pointing to
a pre-prepared directory without any fail markers, allowing the guard to
return `ok=True` when it should block.

BLOCKING — Add validation that `events_dir` is within `WORKSPACE_ROOT`.

---

### 2. Evidence File Content Injection

MUST FIX

The `check_done_g3_pass_evidence` function reads JSON evidence files and
checks specific fields. However, the JSON parsing uses `json.loads()` which
accepts arbitrary JSON structures including arrays at the top level.

If `ev_path.read_text()` returns `[]` (empty array), then `json.loads()`
succeeds and `ev` is `[]`. Then `ev.get("result", "")` raises `AttributeError`
on a list, which is caught by the outer `except Exception` and returns
`ok=False` with "파싱 실패".

While this is fail-closed behavior, it means an attacker who can replace
the evidence file with `[]` could produce misleading error messages.

MUST FIX — Add explicit type check: `if not isinstance(ev, dict): return GuardResult(ok=False, ...)`

---

### 3. SHA Comparison Timing Attack

MUST_FIX

The SHA comparison in `check_merge_commit_sha` uses Python's string equality
operator, which is not constant-time. While timing attacks on string comparison
are generally only feasible with network-based oracles, in scenarios where the
comparison result is reflected in response timing, this could theoretically leak
information about the expected SHA.

MUST_FIX — Use `hmac.compare_digest()` for SHA comparisons.

---

### 4. Unvalidated Regex Input

HIGH PRIORITY

The `count_severities` function in `gemini_severity_parser.py` applies
compiled regular expressions to the `body` string. If `body` is extremely
long (e.g., a multi-megabyte PR review), the regex operations could take
excessive time, causing a denial of service condition.

HIGH PRIORITY — Add maximum body length check: reject bodies longer than 1MB.

---

### 5. Missing HMAC on Audit Entries

CRITICAL ISSUE

The audit log entries in `admin-override.jsonl` are plain JSON without
cryptographic authentication. An attacker with filesystem write access
could inject fake audit entries to satisfy the `check_bypass_audit` check.

For example:
```json
{"task_id": "task-pentest", "reason": "legitimate bypass", "ts": "2026-01-01T00:00:00+00:00", "actor": "admin"}
```

Writing this line to the audit log would satisfy all current checks.

CRITICAL ISSUE — Implement HMAC signing of audit entries with a key stored
in a separate secrets manager.

---

### 6. Insufficient Allowlist Reload Detection

BLOCKING

The current implementation reloads `allowed_approvers.json` and
`allowed_bot_accounts.json` on every function call. If an attacker can
perform a filesystem write while the read is in progress (TOCTOU), they
could potentially inject a modified version of the file.

More specifically:
1. Attacker initiates a high-volume request that causes many concurrent
   `check_approver_identity` calls
2. While calls are executing, attacker atomically replaces the config file
3. Some calls see the old file, others see the new file

BLOCKING — Implement file integrity checking (hash comparison) to detect
config file modifications.

---

### 7. Error Message Information Leakage

HIGH PRIORITY

Several Guard functions include detailed error messages that reveal internal
system information:
- `check_g3_fail_blocks_done` reveals marker file paths
- `check_done_g3_pass_evidence` reveals evidence directory structure
- `check_bypass_audit` reveals audit log file path

While this information may be useful for debugging, it also provides
reconnaissance information to an attacker.

HIGH PRIORITY — Implement separate internal (detailed) and external (sanitized)
error messages.

---

### 8. Production Guard Bypass

CRITICAL ISSUE

The `check_bypass_audit` function checks `PRODUCTION=1` to enable fail-fast
mode. However, if an attacker can unset the `PRODUCTION` environment variable
before calling the function, production-mode protections would not apply.

CRITICAL ISSUE — Use a more reliable mechanism than environment variables to
detect production environments (e.g., presence of a signed production certificate
or a configuration file that only exists in production).

---

## Medium Issues

The following items are not blocking but should be addressed:

- The `_run_cmd` function does not sanitize command arguments, which could
  lead to argument injection in some edge cases
- Logging verbosity could be improved for better observability
- Consider adding health check endpoints for Guard functions

## Low Issues

- Code formatting inconsistencies (mix of Korean and English in comments)
- Some test fixtures could be more realistic
- Documentation coverage could be improved
