Threat model
What OpenAgentLock defends, what it does not, and where the trust boundaries are.
Adversaries we care about
- Compromised dependency — an LLM-suggested package, an MCP server, or a transitive dependency pulls in code that calls home, exfiltrates secrets, or rewrites your tree.
- Prompt-injection-driven agent — an attacker plants instructions in a file, a comment, or a tool result that the agent reads, and the agent then runs commands the user did not ask for.
- Confused-deputy harness — the harness itself behaves correctly but its trust surface is wider than the user realizes (e.g. an MCP server they forgot they enabled three months ago).
What OpenAgentLock defends
- Pre-execution gating. Tool calls are evaluated before the harness runs them. A `deny` does not get to "almost happen and then unwind."
- Tamper-evident audit. Every decision is hashed, signed, and committed to the local Merkle ledger. Any later modification changes the root.
- Cross-harness consistency. One policy, one ledger, one signer enrollment — applied uniformly across Claude Code, Codex CLI, Cursor, etc. as those harnesses come online.
- Locality. Nothing leaves your machine by default. The control plane binds to `127.0.0.1` only. There is no telemetry.
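
The first two properties above can be sketched together. This is a minimal illustration, not OpenAgentLock's implementation: the `deny_prefixes` policy shape and the `evaluate`, `gated_run`, and `runner` names are hypothetical, the leaf and node prefixes are borrowed from the style of RFC 6962, and signing is left out to keep the example short. It shows that a denied call never reaches the runner, and that rewriting a recorded decision changes the Merkle root.

```python
import hashlib
import json


def leaf_hash(record: dict) -> bytes:
    """Hash one decision record; the 0x00 prefix marks it as a leaf."""
    data = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise into a single root hash."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = list(leaves)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                # The 0x01 prefix marks an interior node.
                nxt.append(hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # an unpaired node moves up unchanged
        level = nxt
    return level[0]


def evaluate(policy: dict, call: dict) -> str:
    """Hypothetical policy check: deny any command with a listed prefix."""
    for prefix in policy["deny_prefixes"]:
        if call["command"].startswith(prefix):
            return "deny"
    return "allow"


def gated_run(policy: dict, ledger: list[bytes], call: dict, runner):
    """Decide and record first; invoke the runner only on allow."""
    decision = evaluate(policy, call)
    ledger.append(leaf_hash({"call": call, "decision": decision}))
    if decision == "deny":
        return None  # the runner is never called, so there is nothing to unwind
    return runner(call)


policy = {"deny_prefixes": ["curl ", "rm -rf"]}  # assumed policy shape
ledger: list[bytes] = []
executed: list[str] = []


def runner(call: dict) -> str:
    executed.append(call["command"])
    return "ok"


gated_run(policy, ledger, {"command": "ls -la"}, runner)
gated_run(policy, ledger, {"command": "curl https://example.com"}, runner)
root_before = merkle_root(ledger)

# Rewriting a recorded decision after the fact yields a different root.
tampered = list(ledger)
tampered[1] = leaf_hash(
    {"call": {"command": "curl https://example.com"}, "decision": "allow"}
)
root_after = merkle_root(tampered)
```

After this runs, `executed` holds only the allowed command, the ledger holds both decisions, and `root_before` differs from `root_after`.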
What OpenAgentLock does not defend
- A compromised host. If the attacker already has root, OpenAgentLock cannot tell you that — the daemon, the ledger, and your harness all sit downstream of host trust.
- Already-resident malware. We hook the harness, not the OS. A backdoor running in another process is not in our path.
- Network-layer attacks. TLS does the wire job; we do the record job.
- Misconfiguration of the harness itself. If you tell Claude Code to skip permissions checks, our hooks still fire, but the trust shape is your call.
- Side-channel exfiltration via permitted channels. If you allow `curl` to your blog, an agent can write a poem that encodes secrets in adjective choice. Policy is necessary, not sufficient.
Trust boundaries
```mermaid
flowchart TB
    subgraph host["HOST — your shell, filesystem, keychain"]
        direction LR
        H["agent harness<br/>(Claude Code, Codex, Cursor, …)"]
        CLI["agentlock CLI<br/><b>owns long-lived signing key</b>"]
        H -->|hook| CLI
    end
    subgraph cp["CONTROL PLANE — Docker, 127.0.0.1:7878 / :7879"]
        direction LR
        P["policy"] --> LED["ledger"] --> DB["dashboard"]
    end
    CLI -->|"signed session-scoped key"| cp
    style host fill:#fafafa,stroke:#bbb,stroke-dasharray:4 3
    style cp fill:#f0f0f0,stroke:#666
```
The CLI on the host owns the long-lived key (TOTP-unlocked or hardware key). It signs a short-lived session key at startup and posts the signed bundle to the daemon. The daemon signs ledger leaves with the session key in memory. The long-lived key never crosses into the container.
YubiKey deliberately does not work inside Docker. USB HID is not bridged into Linux containers. The split is by design.
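
The key split described above can be sketched as a two-tier delegation. This is an illustration under loud assumptions: HMAC-SHA256 stands in for a real signature scheme so the example needs only the standard library (a real deployment would presumably use asymmetric signatures so an auditor can verify with public keys alone), and `make_session_bundle`, `verify_bundle`, `Daemon`, and the bundle fields are hypothetical names, not OpenAgentLock's API.

```python
import hashlib
import hmac
import json
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Stand-in for a real signature: HMAC-SHA256 (symmetric, illustration only)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_session_bundle(long_lived_key: bytes, ttl_seconds: int, now: float):
    """Host side: mint a session key and sign its identity and expiry."""
    session_key = secrets.token_bytes(32)
    claims = {
        "session_id": hashlib.sha256(session_key).hexdigest(),
        "expires_at": now + ttl_seconds,
    }
    payload = json.dumps(claims, sort_keys=True).encode()
    bundle = {"claims": claims, "signature": sign(long_lived_key, payload).hex()}
    return bundle, session_key


def verify_bundle(long_lived_key: bytes, bundle: dict) -> bool:
    """Check that the bundle was signed under the given long-lived key."""
    payload = json.dumps(bundle["claims"], sort_keys=True).encode()
    expected = sign(long_lived_key, payload)
    return hmac.compare_digest(expected, bytes.fromhex(bundle["signature"]))


class Daemon:
    """Container side: holds only the session key, never the long-lived key."""

    def __init__(self, bundle: dict, session_key: bytes):
        self.bundle = bundle
        self._session_key = session_key
        self.leaves: list[tuple[bytes, bytes]] = []

    def append(self, leaf: bytes, now: float) -> bytes:
        """Sign and store a ledger leaf, refusing once the session has expired."""
        if now >= self.bundle["claims"]["expires_at"]:
            raise PermissionError("session expired; the CLI must sign a new one")
        signature = sign(self._session_key, leaf)
        self.leaves.append((leaf, signature))
        return signature


long_lived_key = secrets.token_bytes(32)  # stays on the host
t0 = 1_000.0                              # fixed clock for a repeatable demo
bundle, session_key = make_session_bundle(long_lived_key, ttl_seconds=60, now=t0)
daemon = Daemon(bundle, session_key)

daemon.append(b"leaf-1", now=t0 + 10)      # within the TTL: signed and stored
try:
    daemon.append(b"leaf-2", now=t0 + 61)  # past the TTL: refused
    expired_rejected = False
except PermissionError:
    expired_rejected = True
```

The point of the shape is that the daemon can be restarted, inspected, or compromised without exposing the long-lived key. The worst case is bounded by the session's expiry.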
Failure modes by category
| Failure | Effect | Mitigation |
|---|---|---|
| Daemon dies mid-call | Harness sees a hook timeout; the harness's own default applies. Nothing under our control fails open. | Restart the daemon, then run `agentlock ledger verify` |
| Session expires under load | Next ledger append fails until the CLI re-signs a session | Tune the session TTL to fit your tap cadence |
| Policy file syntax error | mode defaults to monitor; nothing blocks | The dashboard validates before save; CI also lints |
| Long-lived key compromise | All sessions signed under that key are suspect | Rotate, then mark prior ledger range as untrusted in audit |
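
The policy-file row is worth a concrete look, because falling back to monitor means a broken policy silently stops blocking. The sketch below assumes a JSON policy file and an `enforce` mode name, neither of which is confirmed by this page. `load_policy` and `decide` are hypothetical helpers written only to show the fallback behavior.

```python
import json


def load_policy(text: str) -> dict:
    """Parse a policy document; on any syntax problem, fall back to monitor mode."""
    fallback = {"mode": "monitor", "load_error": True}
    try:
        policy = json.loads(text)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(policy, dict):
        return fallback  # valid JSON, but not a policy object
    policy.setdefault("mode", "monitor")
    return policy


def decide(policy: dict, rule_says_deny: bool) -> str:
    """Only an enforcing policy blocks; monitor mode records but allows."""
    if rule_says_deny and policy["mode"] == "enforce":
        return "deny"
    return "allow"


good = load_policy('{"mode": "enforce"}')
broken = load_policy('{"mode": "enforce",')  # trailing comma, truncated object

good_decision = decide(good, rule_says_deny=True)      # blocked
broken_decision = decide(broken, rule_says_deny=True)  # not blocked
```

Under these assumptions the same rule match is denied with the valid file and allowed with the broken one, which is why validating before save and linting in CI carry real weight.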