VERDICT — Architecture
Architecture diagram with trust boundaries, distinguishing prompt-based guardrails from architectural guardrails.
This document is the single-page visual summary that operators reach for first. It is the public architecture source for VERDICT's seven-layer product shape, credential modes, and Claude Code primary-interface model.
Architectural pattern claimed (under Amendment A2)
VERDICT combines two architectural patterns:
- Direct Agent Extension — Claude Code IS the agent. The operator runs scripts/verdict <evidence> for the one-shot path, or claude / scripts/find-evil at the repo root for interactive exploration; .mcp.json auto-spawns both MCP servers; Claude Code drives the investigation as supervisor + Pool A/B subagents (native Task mechanism — not CLAUDE_CODE_FORK_SUBAGENT, which is a build-time internal and is not used in this product).
- Custom MCP Server — two purpose-built MCP servers expose the typed tool surface:
  - findevil-mcp (Rust) — 31 DFIR primitives (core Windows memory/disk/log/network verbs plus allow-listed long-tail wrappers such as vol_run, ez_parse, plaso_parse, mac_triage, and cloud_audit). Read-only on evidence; SHA-256 every output. NO execute_shell.
  - findevil-agent-mcp (Python) — 12 crypto + ACH + memory + ACP + expert-feedback tools (audit_append/verify, manifest_finalize/verify, verify_finding, detect_contradictions, judge_findings, correlate_findings, memory_remember/recall, pool_handoff, expert_miss_capture). The pre-A5 ots_stamp/ots_verify pair was removed.
The combination is the architectural claim: Claude Code's agent loop never touches a raw shell because the only verbs it has are MCP-typed function calls into one of the two servers.
Maturity note. The 31 Rust verbs are implemented as a typed, allow-listed surface. The
long-tail verbs vol_run, ez_parse, plaso_parse, mac_triage, cloud_audit,
journalctl_query, login_accounting, ausearch, nfdump_query, suricata_eve, and
indx_parse are fixture-tested but not yet exercised on real evidence in a committed run; the
committed sample runs prove the core disk/registry/EVTX/MFT/Prefetch/YARA/USN/Hayabusa/Sysmon/
Zeek/PCAP, vol_*, vel_collect, and browser_history paths.
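The closed-surface idea can be illustrated with a minimal Python sketch. This is not VERDICT's code: the handler bodies, the `REGISTRY` dictionary, and the `dispatch` helper are hypothetical stand-ins, and the Rust server fixes its registry at compile time rather than in a runtime dictionary.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in handlers; the real verbs live in the Rust server.
def evtx_query(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"tool": "evtx_query", "path": params["path"]}

def mft_timeline(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"tool": "mft_timeline", "path": params["path"]}

# The registry is the whole tool surface: a name that is not a key here
# cannot be called, so there is no generic "run this command" escape hatch.
REGISTRY: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "evtx_query": evtx_query,
    "mft_timeline": mft_timeline,
}

class UnknownTool(Exception):
    """Raised when a caller names a verb outside the registry."""

def dispatch(name: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Route a call to its handler, or refuse if the verb is not registered."""
    handler = REGISTRY.get(name)
    if handler is None:
        raise UnknownTool(name)
    return handler(params)
```

In this sketch a request for `execute_shell` fails at lookup, before any handler code runs, which is the property the combined pattern relies on.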
Relationship to Protocol SIFT
VERDICT runs on the same SANS SIFT VM (sift-2026.03.24.ova) that Protocol SIFT operates on — they are not in conflict.
Deliberate divergence in the MCP surface:
| Aspect | VERDICT | Protocol SIFT gateway |
|---|---|---|
| Product MCP servers | 2 typed, audit-chained servers (findevil-mcp, findevil-agent-mcp); .mcp.json registers 6 servers total including 4 non-product operator conveniences | 1 gateway (200+ shell-backed tools) |
| Tool count | 43 (31 Rust DFIR + 12 Python crypto/ACH/memory/ACP/expert) | 200+ (dynamic, shell coverage) |
| Shell surface | None — NO execute_shell | Broad — gateway is a shell pass-through |
| Use case | Repeatable DFIR mechanics for evidence investigation | General-purpose bot connectivity |
| Installation | No conflicts — separate MCP registrations | protocol-sift install installs the gateway independently |
After protocol-sift install on a SIFT VM, both VERDICT's narrow typed surface and Protocol SIFT's broad shell-backed gateway coexist. Operators choose which agent interface to use per investigation; neither requires nor conflicts with the other.
The narrow surface is intentional: it reduces the attack surface from "full shell access" to 31 named Rust DFIR operations and 12 Python cryptographic/ACH/memory/ACP/expert operations, enabling an architectural argument that the agent loop never touches shell primitives directly — all actions flow through typed JSON-RPC schema validation.
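As a rough illustration of what schema validation at the JSON-RPC boundary means, the Python sketch below accepts an MCP `tools/call` request only when its arguments match a declared schema exactly. The argument names (`path`, `event_id`), the schema dictionary, and both helper functions are assumptions made for the example; the product's real schemas are not reproduced here.

```python
import json
from typing import Any, Dict

# Hypothetical argument schema for one tool: field name -> required type.
EVTX_QUERY_SCHEMA: Dict[str, type] = {"path": str, "event_id": int}

def validate_arguments(args: Dict[str, Any], schema: Dict[str, type]) -> Dict[str, Any]:
    """Accept arguments only if they match the schema exactly."""
    unknown = set(args) - set(schema)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    missing = set(schema) - set(args)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name, expected in schema.items():
        value = args[name]
        # bool is a subclass of int in Python; this sketch has no bool
        # fields, so any bool value is treated as a type error.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"field {name!r} must be {expected.__name__}")
    return args

def parse_tool_call(raw: str) -> Dict[str, Any]:
    """Parse one JSON-RPC 2.0 'tools/call' request and validate its arguments."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("method") != "tools/call":
        raise ValueError("unsupported request")
    call = msg.get("params", {})
    if call.get("name") != "evtx_query":
        raise ValueError("unknown tool")
    return validate_arguments(call.get("arguments", {}), EVTX_QUERY_SCHEMA)
```

A request that smuggles in an extra field such as `"shell": "id"` is rejected as a validation error instead of being silently ignored or passed through.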
Runtime architecture (the Product that operators run)
flowchart TB
subgraph Trust0["**TRUST BOUNDARY 0** — Evidence Vault (read-only)"]
Evidence["/evidence/case-id/<br/>Original .e01<br/>SHA-256 verified<br/>chmod 444 / mount -o ro"]
end
subgraph Trust1["**TRUST BOUNDARY 1** — SIFT Tool Subprocesses (unprivileged, sandboxed)"]
Hayabusa[Hayabusa<br/>AGPL-3.0<br/>subprocess]
Chainsaw[Chainsaw v2<br/>GPL-2.0<br/>subprocess]
Volatility[Volatility3<br/>AGPL-3.0<br/>subprocess]
Velociraptor[Velociraptor<br/>AGPL-3.0<br/>gRPC subprocess]
YARA[YARA + Forge Core<br/>subprocess scan]
end
subgraph Trust2["**TRUST BOUNDARY 2** — Two MCP Servers (typed tool surface)"]
RustMcp["**findevil-mcp** (Rust, hand-rolled MCP 2024-11-05)<br/>31 typed DFIR tools<br/>NO execute_shell<br/>---<br/>core Windows memory/disk/log/network verbs<br/>+ allow-listed long-tail wrappers"]
AgentMcp["**findevil-agent-mcp** (Python, mcp SDK 1.x)<br/>12 typed crypto/ACH/memory/ACP/expert-feedback tools<br/>---<br/>audit_append/verify,<br/>manifest_finalize/verify,<br/>verify_finding,<br/>detect_contradictions,<br/>judge_findings,<br/>correlate_findings,<br/>memory_remember/recall,<br/>pool_handoff,<br/>expert_miss_capture"]
EvtxCrate["evtx crate<br/>MIT, in-process<br/>~1600× python-evtx (upstream)"]
Merkle["hand-rolled Merkle<br/>(rs_merkle-compatible semantics)<br/>append-only tree"]
DuckDB["DuckDB L1 case store<br/>(path reserved, not yet initialized)"]
end
subgraph Trust3["**TRUST BOUNDARY 3** — Claude Code agent loop (A2 — replaces LangGraph)"]
Supervisor["Claude Code main agent<br/>= supervisor<br/>reads agent-config/SOUL.md<br/>+ AGENTS.md + MEMORY.md"]
PoolA["Pool A subagent<br/>(native Task mechanism)<br/>persistence-biased prompt:<br/>Tasks, Services, WMI,<br/>Run, IFEO, LOLBins"]
PoolB["Pool B subagent<br/>(native Task mechanism)<br/>exfil-biased prompt:<br/>net connections, staging,<br/>certutil/bitsadmin, cloud sync,<br/>USB writes"]
Contradiction["detect_contradictions<br/>(MCP tool call into agent_mcp)<br/>FIRES BEFORE JUDGE"]
Judge["judge_findings<br/>credibility-weighted<br/>Estornell ICML 2025"]
Verifier["verify_finding<br/>re-executes cited tool calls<br/>vetoes uncited Findings"]
Correlator["correlate_findings<br/>≥2 artifact classes<br/>for execution claims"]
end
subgraph Trust4["**TRUST BOUNDARY 4** — Crypto Custody (M2)"]
SignerTier["Signer tier<br/>Ed25519 default<br/>Sigstore identity tier<br/>stub blocks release"]
AuditJSONL["audit.jsonl<br/>hash-chained, append-only<br/>prev_hash per line"]
Manifest["run.manifest.json<br/>signs Merkle root +<br/>audit-log final hash"]
end
subgraph Trust5["**TRUST BOUNDARY 5** — Presentation"]
Terminal["Claude Code terminal<br/>findings / contradictions /<br/>plans rendered as text<br/>(primary UX under A2)"]
VerdictSh["scripts/verdict<br/>canonical one-shot launcher<br/>preflight + investigate + report"]
AutoEngine["internal automation engine<br/>find-evil-auto<br/>non-interactive by default"]
FindEvilSh["scripts/find-evil<br/>interactive helper<br/>= claude (in cwd)"]
NextJS["Next.js 15 SPA<br/>SSE audit-tail route + debug viewer<br/>local operator aid"]
MCPWidgets["MCP App widgets SEP-1865<br/>(deferred — week-7 bonus)"]
end
Human((Analyst /<br/>Operator)) -->|scripts/verdict| VerdictSh
Human -->|scripts/find-evil| FindEvilSh
Human -->|claude| Terminal
VerdictSh --> AutoEngine
AutoEngine --> Supervisor
FindEvilSh --> Terminal
Terminal --> Supervisor
Evidence -.->|read-only mount| Trust1
Trust1 -->|stdout parsed<br/>subprocess boundary| RustMcp
RustMcp -->|typed JSON-RPC<br/>stdio transport| Trust3
AgentMcp -->|typed JSON-RPC<br/>stdio transport| Trust3
Supervisor --> PoolA
Supervisor --> PoolB
PoolA --> Contradiction
PoolB --> Contradiction
Contradiction -->|ContradictionFound<br/>event surfaced FIRST| Terminal
Contradiction --> Judge
Judge --> Verifier
Verifier --> Correlator
Correlator --> Trust4
Trust3 --> SignerTier
RustMcp -.->|tool output digest<br/>becomes Merkle leaf| Merkle
AgentMcp -->|audit_append| AuditJSONL
AgentMcp -->|manifest_finalize| Manifest
AuditJSONL -->|leaves + final hash| Manifest
Merkle --> Manifest
SignerTier --> Manifest
Human -->|approve / reject<br/>plan + contradictions| Trust3
style Trust0 fill:#e8f5e9,stroke:#2e7d32,stroke-width:3px
style Trust1 fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
style Trust2 fill:#e3f2fd,stroke:#1565c0,stroke-width:3px
style Trust3 fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px
style Trust4 fill:#fffde7,stroke:#f9a825,stroke-width:3px
style Trust5 fill:#fce4ec,stroke:#ad1457,stroke-width:2px
Trust boundary legend
| # | Boundary | Enforcement mechanism | Type |
|---|---|---|---|
| 0 | Evidence vault | Architectural (shipped): originals opened read-only (libewf for .e01); SHA-256 fingerprinted at case_open and re-checked at every verifier replay; no write verb exists anywhere in the 43-tool product surface. Hardened-deployment posture (recommended, not code-enforced): mount -o ro + chmod 444 on the vault, inotifywait write-monitoring | Code-enforced today; filesystem hardening is operator posture |
| 1 | SIFT tool subprocesses | Architectural (shipped): unprivileged user (no root, no CAP_SYS_ADMIN); fixed-argv invocation — Command::new(bin).args([...]), never sh -c, so a path/arg is never shell-parsed (adversarially pinned by services/mcp/tests/bypass_paths.rs). Roadmap (documented, not yet enforced in code): per-call wall-clock budget, cpulimit, tmpfs work dir, binary allowlist | Process-enforced today; resource sandboxing is roadmap |
| 2 | Two typed MCP servers | Architectural: Rust findevil-mcp type system forbids execute_shell; Python findevil-agent-mcp Pydantic input models use extra="forbid"; tool surfaces fixed at compile/build time. Adding a shell passthrough would require a code change + PR + review | Compiler/schema-enforced |
| 3 | Claude Code agent loop | Mixed: agent system prompts (agent-config/SOUL.md — epistemic hierarchy, AGENTS.md — roles) are prompt-based guardrails; verifier veto (no Finding without tool_call_id) is architectural (Pydantic schema-level enforced at the findevil-agent-mcp boundary). Roadmap (tracked, not yet in code): emit in-chain self-correction — a course_correction/re_evaluation audit record citing the triggering tool_call_id when the agent reverses or down-grades a Finding — so analyst-driven revisions are auditable under manifest_verify (#54) | Mixed — prompt guards behavior, Pydantic guards data |
| 4 | Crypto Custody | Architectural: manifest signing and Merkle root computation happen inside findevil-agent-mcp before any finding is user-visible. Ed25519 is the offline-verifiable default; Sigstore/Rekor is the identity + transparency-log tier; the pre-A5 OpenTimestamps/Bitcoin tier was removed so manifest_finalize is the terminal custody step | Cryptographic |
| 5 | Presentation | DEFERRED to bonus (A2 §2.1). The terminal IS the primary UX. Optional Next.js SSE bus (when shipped) is read-only from the frontend; --unattended mode logs approved_by: "auto" to the audit chain. | Auth-enforced (when present) |
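The fixed-argv rule in boundary 1 is described for Rust (`Command::new(bin).args([...])`). A Python analogue, offered only as a sketch of the same pattern, is `subprocess.run` with an argument list and `shell=False`. The helper name `run_fixed_argv` and the demo payload are hypothetical; the child process here is the Python interpreter itself so the example has no external dependencies.

```python
import subprocess
import sys
from typing import List

def run_fixed_argv(binary: str, args: List[str]) -> str:
    """Run a tool with an argument vector; no shell ever parses the values."""
    completed = subprocess.run(
        [binary, *args],      # argv list, passed to the OS as-is
        shell=False,          # never routed through sh -c
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout

# A path that would be dangerous under `sh -c` is just an odd string here.
hostile_path = "evidence.e01; touch /tmp/pwned"

# The child simply writes back its first argument so the caller can see
# exactly what it received.
echo_argv1 = "import sys; sys.stdout.write(sys.argv[1])"
echoed = run_fixed_argv(sys.executable, ["-c", echo_argv1, hostile_path])
```

Because the payload arrives in the child as a single argument, the `;` is never interpreted as a command separator. That is the property the bypass tests in the table are described as pinning.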
Prompt-based vs architectural guardrails — explicit distinction
Prompt-based guardrails (prompts that GUIDE behavior):
- agent-config/SOUL.md epistemic hierarchy (CONFIRMED > INFERRED > HYPOTHESIS)
- agent-config/AGENTS.md specialist roles and tool scope
- agent-config/MEMORY.md DFIR artifact semantics (Amcache ≠ execution time, etc.)
- agent-config/HEARTBEAT.md canary string self-check every turn
Prompt guardrails can fail — that is the design assumption, not a surprise; when they do, the
architectural guardrails below must catch the fallout. What is bypass-tested today is the
architectural layer (services/mcp/tests/bypass_paths.rs: shell-payload paths, .. traversal,
flag-looking paths — all inert), plus the HEARTBEAT.md canary as the in-run prompt-injection
tripwire. Dedicated prompt-injection fixtures in goldens/ are planned and not yet shipped —
we say so here rather than claim them.
Architectural guardrails (structural controls that PHYSICALLY PREVENT bad outcomes):
- Read-only evidence access (code-enforced: libewf read-only open, SHA-256 at case_open re-checked at every replay, no write verb in the tool surface; pair with a read-only mount in hardened deployments)
- Typed Rust MCP server (findevil-mcp) with no execute_shell (compiler-enforced; adding shell passthrough requires a code change and PR review)
- Typed Python MCP server (findevil-agent-mcp) with Pydantic extra="forbid" on every input model (boundary-enforced; unknown fields surface as validation errors)
- Pydantic schema on Finding events requires tool_call_id (schema-enforced; unvalidated Findings can't exit the agent_mcp boundary)
- Hash-chained audit.jsonl (prev_hash per line; chain replay catches any backdated/mutated entry)
- manifest signing at the findevil-agent-mcp layer (Ed25519 default; Sigstore/Rekor when configured; explicit stub fallback blocks customer release)
- Merkle tree append-only at the findevil-agent-mcp layer (agent cannot rebuild the tree to favor a different leaf set)
- Sigstore/Rekor transparency-log inclusion proof when that tier is configured (agent cannot forge the signed manifest provenance)
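To make the hash-chain guardrail concrete, here is a minimal Python sketch of an append-only log where each record stores the hash of its predecessor. The record layout (`prev_hash` plus `event`), the all-zero genesis value, and the canonical JSON encoding are assumptions for illustration, not the actual audit.jsonl format.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed prev_hash for the first record

def record_hash(record: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of one record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def append(chain: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Append an event, linking it to the hash of the previous record."""
    prev = record_hash(chain[-1]) if chain else GENESIS
    chain.append({"prev_hash": prev, "event": event})

def verify(chain: List[Dict[str, Any]], expected_final: str) -> bool:
    """Replay the chain; an edited, dropped, or reordered record breaks a link."""
    prev = GENESIS
    for record in chain:
        if record["prev_hash"] != prev:
            return False
        prev = record_hash(record)
    # The final hash is pinned elsewhere (for example in a signed manifest),
    # which is what makes a change to the last record detectable too.
    return prev == expected_final
```

A typical use is to compute `record_hash(chain[-1])` when the run is finalized, store it alongside the signature, and later call `verify(chain, stored_hash)` during replay.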
The no-arbitrary-execution claim is machine-checkable in-repo today: the tool registry is fixed
at compile time (services/mcp/src/tools/mod.rs — adding a verb is a code change + review),
scripts/divergence-smoke.py asserts the product MCP servers register no
execute_shell/bash -c-shaped surface, and services/mcp/tests/bypass_paths.rs exercises the
boundary adversarially. (A third-party mcp-scanner pass is on the pre-release checklist; no
scanner artifact ships in this tree yet.)
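A check in the spirit of the smoke test described above could look like the following Python sketch. It is not the contents of scripts/divergence-smoke.py: the deny pattern and the function name `shell_shaped_tools` are hypothetical, and a real check would read the tool names from the servers' registered tool lists.

```python
import re
from typing import Iterable, List

# Hypothetical deny pattern: names such as execute_shell, run_command,
# or any name with a standalone shell/bash/sh component.
SHELL_SHAPED = re.compile(
    r"(^|_)(execute|exec|run)_(shell|command|cmd)$"
    r"|(^|_)(shell|bash|sh)(_|$)"
)

def shell_shaped_tools(tool_names: Iterable[str]) -> List[str]:
    """Return every registered tool name that looks like a shell passthrough."""
    return [name for name in tool_names if SHELL_SHAPED.search(name)]
```

Run against a registered surface, an empty result is the passing case; any returned name would fail the build.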
Credential modes (Amendment A1)
The Product (what operators run) detects three credentials in priority order via scripts/install.sh and services/agent/config.py resolve_credentials():
flowchart TD
Start(["install.sh / resolve_credentials()"])
Check1{CLAUDE_CODE_OAUTH_TOKEN<br/>env var set?}
Check2{~/.claude/<br/>interactive session?}
Check3{ANTHROPIC_API_KEY<br/>env var set?}
Mode1[Mode 1:<br/>long-lived token<br/>from claude setup-token<br/>non-interactive<br/>inference-only scope]
Mode2[Mode 2:<br/>interactive session<br/>from claude auth login<br/>dev/demo use]
Mode3[Mode 3:<br/>direct API<br/>from console.anthropic.com<br/>metered, < $1/run]
Fail["FAIL FAST<br/>error message lists<br/>all 3 options"]
Start --> Check1
Check1 -->|yes| Mode1
Check1 -->|no| Check2
Check2 -->|yes| Mode2
Check2 -->|no| Check3
Check3 -->|yes| Mode3
Check3 -->|no| Fail
style Mode1 fill:#c8e6c9
style Mode2 fill:#c8e6c9
style Mode3 fill:#c8e6c9
style Fail fill:#ffcdd2
All three modes are fully supported. Operators pick whichever they already have; no particular mode is mandatory, and the resolver fails fast only when none of the three is present.
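The priority order in the flowchart can be written as a short Python function. This is a sketch under assumptions, not the body of resolve_credentials(): the function name `resolve_mode`, the returned mode labels, and the boolean `has_interactive_session` parameter (standing in for however the session under ~/.claude/ is actually detected) are all hypothetical.

```python
from typing import Mapping

class NoCredentials(RuntimeError):
    """Raised when none of the three credential modes is available."""

def resolve_mode(env: Mapping[str, str], has_interactive_session: bool) -> str:
    """Return the first available credential mode in the documented order."""
    if env.get("CLAUDE_CODE_OAUTH_TOKEN"):
        return "oauth_token"          # Mode 1: long-lived token
    if has_interactive_session:
        return "interactive_session"  # Mode 2: interactive login
    if env.get("ANTHROPIC_API_KEY"):
        return "api_key"              # Mode 3: direct API
    # Fail fast and name every option so the operator can pick one.
    raise NoCredentials(
        "no credential found: set CLAUDE_CODE_OAUTH_TOKEN (claude setup-token), "
        "log in interactively (claude auth login), "
        "or set ANTHROPIC_API_KEY"
    )
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the priority logic easy to test in isolation.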
Data flow — a single investigation from .e01 to verdict (under A2)
- Operator runs scripts/verdict <evidence> for a one-shot live investigation, or claude / scripts/find-evil at the repo root for interactive mode. The one-shot launcher performs preflight, starts the optional dashboard unless --no-dashboard is set, and delegates to the internal find-evil-auto engine. The interactive path uses Claude Code, which reads .mcp.json, spawns both MCP servers, and ingests CLAUDE.md + agent-config/* as system context.
- In interactive mode, the operator prompts: "investigate fixtures/nist-hacking-case/SCHARDT.001". In one-shot mode, scripts/verdict supplies the evidence path to the internal engine. The supervisor calls case_open (Rust MCP) — SHA-256 verifies the image, opens via libewf read-only, reserves the evidence.ddb path at ~/.findevil/cases/<id>/evidence.ddb (the DuckDB L1 store is not yet initialized), calls audit_append (Python MCP) for the open event.
- Claude Code emits a plan as text (no PlanProposed event needed — the terminal IS the channel) and forks two subagents via the native Task mechanism: one with the Pool A persistence prompt, one with Pool B exfil.
- Each pool subagent invokes Rust MCP DFIR tools (evtx_query, mft_timeline, hayabusa_scan, etc.); each call's SHA-256 output digest is audit_append-ed and contributes a Merkle leaf at manifest_finalize time.
- Both subagents return Findings (each citing a tool_call_id). Supervisor calls detect_contradictions (Python MCP) which surfaces Pool A vs Pool B disagreements before the judge fires.
- Analyst resolves contradictions (Trust A / Trust B / Flag) in the terminal, or --unattended mode auto-passes them.
- Supervisor calls verify_finding (Python MCP) for each candidate Finding — the wrapper spawns its own short-lived findevil-mcp subprocess and re-runs the cited tool call. Drift downgrades the Finding by one tier.
- Supervisor calls judge_findings (Python MCP) — credibility-weighted merge per Estornell ICML 2025.
- Supervisor calls correlate_findings (Python MCP) — SOUL.md cross-artifact rule downgrades execution claims that lack ≥2 artifact-class corroboration; Amcache-only execution gets the hard-coded downgrade.
- Supervisor calls manifest_finalize (Python MCP) — builds the Merkle root, signs the canonicalized body via the selected signer tier (Ed25519 by default, Sigstore for identity/transparency, or explicit stub for tests), writes run.manifest.json, and finalizes the audit chain. This is the terminal custody step under A5.
- Supervisor renders the RunVerdict to the terminal with paths to the manifest and report.
- Offline replay: manifest_verify reproduces the proof end-to-end, citing FRE 902(14) with the post-A5 Rekor timestamp trade-off.
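The step where tool-output digests become Merkle leaves can be sketched in a few lines of Python. The pairing rule used here (hash the concatenation of two siblings, carry an unpaired node up unchanged) and the leaf ordering are assumptions chosen for the example; the product's hand-rolled tree may differ in detail.

```python
import hashlib
from typing import List

def merkle_root(leaf_digests: List[bytes]) -> bytes:
    """Fold a list of leaf digests into a single root digest."""
    if not leaf_digests:
        raise ValueError("at least one leaf is required")
    level = list(leaf_digests)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                # Parent = SHA-256(left || right).
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                # Assumed rule: an unpaired node is promoted unchanged.
                nxt.append(level[i])
        level = nxt
    return level[0]

# Leaves are the SHA-256 digests of each tool call's output, in call order.
outputs = [b"evtx_query output", b"mft_timeline output", b"hayabusa_scan output"]
leaves = [hashlib.sha256(o).digest() for o in outputs]
root = merkle_root(leaves)
```

Because the root depends on every leaf, signing it commits the manifest to the exact set and order of tool outputs produced during the run.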
Where we differ from the reference bar (Valhuntir)
| Dimension | Valhuntir (reference) | Us |
|---|---|---|
| MCP server | Python, 8 servers via sift-gateway, 100+ tools | Two audit-chained product MCP servers — Rust findevil-mcp (31 DFIR tools, including the deliberately-redundant vol_pslist + vol_psscan pair plus vol_psxview for DKOM cross-validation, disk mount/extract helpers, network/log triage, and allow-listed long-tail wrappers) + Python findevil-agent-mcp (12 crypto/ACH/memory/ACP/expert-feedback tools); .mcp.json has 6 registered servers total, but the 4 non-product helpers emit no Findings; no execute_shell |
| Agent runtime | Custom Python harness | Claude Code itself ("Direct Agent Extension" pattern) — no custom orchestrator to maintain |
| Chain-of-custody | Password-gated HMAC (PBKDF2 2M iter) | Ed25519/Sigstore signer tier + Merkle + audit hash chain (FRE 902(14) self-authenticating, with the A5 timestamp trade-off documented) |
| Agent pattern | Single agent + human approval | ACH dual-agent (persistence vs exfil) via Claude Code forked subagents + judge + contradiction surface |
| Benchmarks published | None (their README: "no performance metrics disclosed") | DFIR-Metric scoring harness + leaderboard wiring present; no score published yet (roadmap) |
| UI | Browser Examiner Portal | Claude Code terminal (primary); Next.js SPA + MCP Apps widgets (week-7 polish bonus, deferred) |
| Install pattern | curl ... \| bash one-liner | curl ... \| bash one-liner (same pattern, our repo) |
| Credential mode | 1 (their gateway config) | 3 (CLAUDE_CODE_OAUTH_TOKEN / interactive / API key) |
We match Valhuntir's architectural discipline and exceed it on three dimensions that are documented, measurable on the cases actually scored, and legally framed.
References
- README.md + INSTALL.md + QUICKSTART.md — public install and operator contract
- docs/reference/mcp-and-tools.md — registered MCP servers and product tool inventory
- agent-config/SOUL.md + AGENTS.md + TOOLS.md + MEMORY.md + HEARTBEAT.md — runtime agent identity