Crawdad is built to protect AI agents in adversarial environments. This page describes how we protect you and your data.
Security Overview
Crawdad is a local security tool for teams deploying autonomous AI agents. The sidecar runs entirely on your machine. Only signed metering packets (event counts, Ed25519-signed, sequence-numbered) are transmitted upstream. Raw prompts, responses, action parameters, and PII never leave your machine. This is enforced by architecture, not policy: even Crawdad as a company cannot see your content. The system is implemented in Rust for memory safety and is covered by 1,201 tests across 17 crates.
Five architectural invariants are property-tested over 1,000,000 iterations on every release by the crawdad-zk-verify crate. A standalone, MIT-licensed reproducer (getcrawdad/zk-verify) is on the roadmap so that any operator can independently confirm the invariants against a running sidecar.
Reproducible Detection Benchmark
Crawdad's detection claims are reproducible. The AndrewSispoidis/contemporary-agent-attacks repository contains 497 attacks across 13 categories, 1,172 benign samples across 4 categories, and a tool-agnostic benchmark runner — licensed CC-BY 4.0 so any operator, researcher, or competing vendor can run it against any classifier.
Current Crawdad v0.10.0 result on the public corpus: 99.80% detection, 0% false-positive rate, 99.90% F1. Twelve of 13 attack categories are at 100%; the remaining "other" bucket is at 98.57%. There are zero false positives across all four negative categories: hand-curated, conversational, developer messages, and security-discussion.
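The headline figures can be cross-checked against the corpus sizes. The sketch below assumes exactly one missed attack out of 497 and no false positives; the miss count is inferred from the published percentages and is not stated on this page.

```python
# Cross-check of the published metrics against the corpus sizes.
# ASSUMPTION: exactly one missed attack (inferred from 99.80%, not stated here).
attacks, benign = 497, 1172
missed, false_positives = 1, 0

true_positives = attacks - missed
recall = true_positives / attacks  # detection rate
precision = true_positives / (true_positives + false_positives)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = false_positives / benign

print(f"detection {recall:.2%}, FPR {false_positive_rate:.2%}, F1 {f1:.2%}")
# prints: detection 99.80%, FPR 0.00%, F1 99.90%
```

Under that assumption the three published numbers are mutually consistent.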
The benchmark runner accepts adapters for any tool exposing an HTTP scan endpoint — see CONTRIBUTING.md for submission guidelines.
Per-Agent Isolation — Trust System v1.0
Every agent that talks to Crawdad runs at one of four trust levels: Autonomous (full L1–L6 detection, L7 skipped, no restrictions), Monitored (full L1–L7, default for new agents), Restricted (full pipeline + active per-tool-call restrictions), Quarantined (all requests return HTTP 403 before detection runs).
Per-agent enforcement, not per-provider: each inbound request is attributed at connection time by resolving the caller's TCP socket to its owning PID and walking the process tree for a classifiable agent ancestor. Blocking one agent never affects another agent on the same provider. An attacker who compromises a low-trust agent cannot reach a high-trust agent's quota or traffic.
Automatic escalation on detection — when any detection fires on an attributed agent, the system ratchets its trust level down according to a fixed rule set: score-based (≥50 escalates Autonomous, ≥70 escalates Monitored, ≥90 escalates Restricted), rate-based (two detections in 5 min, three in 10 min), and category-based (exfiltration patterns quarantine immediately from any level). This is the 3am-incident backstop — even with no human in the loop, a compromised agent's blast radius is bounded.
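The score-based and category-based rules above can be sketched as a small state function. This is an illustration only, not Crawdad's implementation: the names Trust and escalate are hypothetical, the sketch assumes each score-based escalation moves the agent exactly one level down, and the rate-based rules are omitted because the text does not say which level they lead to.

```python
from enum import IntEnum


class Trust(IntEnum):
    AUTONOMOUS = 0
    MONITORED = 1
    RESTRICTED = 2
    QUARANTINED = 3


# Score at or above which an agent at the given level is escalated.
SCORE_THRESHOLDS = {Trust.AUTONOMOUS: 50, Trust.MONITORED: 70, Trust.RESTRICTED: 90}


def escalate(level: Trust, score: int, category: str) -> Trust:
    """Return the trust level after one detection (hypothetical helper)."""
    # Category rule: exfiltration patterns quarantine from any level.
    if category == "exfiltration":
        return Trust.QUARANTINED
    threshold = SCORE_THRESHOLDS.get(level)
    if threshold is not None and score >= threshold:
        # ASSUMPTION: a score-based escalation moves exactly one level down.
        return Trust(level + 1)
    return level


print(escalate(Trust.AUTONOMOUS, 55, "prompt_injection").name)
print(escalate(Trust.MONITORED, 55, "prompt_injection").name)
```

Because the function only ever returns the same level or a more restrictive one, it has the ratchet property described above: a detection can never raise an agent's trust.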
Configurable auto-recovery — agents walk back toward Autonomous after configurable quiet periods (default 1 h for Monitored → Autonomous; default off for Restricted and Quarantined recovery, requiring human review). Manual level changes are never auto-reverted.
Every transition is audited — level changes are recorded in the immutable audit log with the triggering event, previous level, new level, and whether it was manual or automatic. The agent detail view in the dashboard shows the full trust timeline.
Fail-open on attribution failure: if the PID lookup fails (permission denied, process exited, unsupported platform), the connection is treated as an unknown agent at Monitored level. Full scanning runs, no restrictions apply, and the connection is never blocked. We don't deny legitimate traffic because the OS refused to cooperate with a lookup.
Scope of v1.0. What ships: per-agent attribution, the four levels, automatic escalation with the rules above, auto-recovery, default restrictions on entry to Restricted, audited transitions. What's roadmap: fleet-wide policy hierarchy, cross-device trust sync, industry policy templates.
Remote Control Plane — Security Controls
Crawdad v0.10.0 ships a remote control plane that lets operators monitor agents and change trust levels from a paired phone. The plane is built so zero-knowledge still holds — the relay sees only encrypted blobs and opaque device IDs — and a layered set of controls bounds what a paired device (or a compromised one) can do.
Ed25519 command signing — every command originating from a paired device (change trust, release quarantine, acknowledge alert) is signed with the device's Ed25519 key. The sidecar verifies the signature against the stored public key for that device before any command is honored.
Replay prevention — commands carry a monotonic nonce and timestamp. The sidecar rejects any command with a nonce it has already seen or a timestamp outside a bounded window.
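A minimal sketch of the replay check described above, assuming nonces increase strictly per device and a 30-second timestamp window. The window size, the ReplayGuard class, and its method names are all assumptions for illustration; the text states only that the window is bounded. Signature verification would run before this check and is not shown.

```python
import time
from typing import Optional


class ReplayGuard:
    """Rejects commands with a reused nonce or a stale timestamp (illustrative only)."""

    def __init__(self, window_seconds: float = 30.0):
        # ASSUMPTION: the source does not state the window size.
        self.window_seconds = window_seconds
        self.last_nonce = {}  # device_id -> highest nonce accepted so far

    def accept(self, device_id: str, nonce: int, timestamp: float,
               now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Reject timestamps too far in the past or the future.
        if abs(now - timestamp) > self.window_seconds:
            return False
        # With a monotonic nonce, anything at or below the last accepted
        # value has either been seen already or is out of order.
        if nonce <= self.last_nonce.get(device_id, -1):
            return False
        self.last_nonce[device_id] = nonce
        return True


guard = ReplayGuard()
print(guard.accept("phone-1", nonce=1, timestamp=1000.0, now=1000.0))  # True
print(guard.accept("phone-1", nonce=1, timestamp=1000.0, now=1001.0))  # False
```

Tracking only the highest accepted nonce per device keeps the state constant-size, which is one reason to prefer a monotonic counter over random nonces.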
AES-256-GCM encryption for relay traffic — state snapshots pushed from sidecar to phone and alerts streamed from sidecar to phone are AES-256-GCM encrypted with a key derived during pairing. The relay forwards ciphertext it cannot decrypt.
Per-device rate limits — 5 trust-level changes per 10 minutes per device. 1 quarantine release per hour per device. Rate limits are enforced at the sidecar; the relay has no authority to make rate-limit decisions.
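The two limits above could be enforced with a per-device sliding window, sketched below. The text does not say whether Crawdad uses a sliding or a fixed window, so the sliding window is an assumption, as are the DeviceRateLimiter class and the action names.

```python
from collections import defaultdict, deque

# (max events, window in seconds) per action, taken from the limits stated above.
LIMITS = {"trust_change": (5, 600), "quarantine_release": (1, 3600)}


class DeviceRateLimiter:
    """Sliding-window limiter keyed by device and action (illustrative only)."""

    def __init__(self):
        self.events = defaultdict(deque)  # (device_id, action) -> accepted timestamps

    def allow(self, device_id: str, action: str, now: float) -> bool:
        max_events, window = LIMITS[action]
        history = self.events[(device_id, action)]
        # Drop accepted events that have aged out of the window.
        while history and now - history[0] >= window:
            history.popleft()
        if len(history) >= max_events:
            return False
        history.append(now)
        return True


limiter = DeviceRateLimiter()
results = [limiter.allow("phone-1", "trust_change", t) for t in range(6)]
print(results)  # [True, True, True, True, True, False]
```

Rejected attempts are not recorded, so a device that keeps retrying does not extend its own lockout.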
Local kill switch — the desktop Settings → Paired Devices → Disconnect action invalidates the device's key immediately. Any in-flight or future signed command from that device is rejected; no network round-trip to the relay is required.
Permission scoping per device — each paired device is tagged view-only or full-control. View-only devices can read state snapshots and receive alerts but cannot issue trust-change or quarantine-release commands; the sidecar rejects unauthorized commands independent of signature validity.
PIN gate — sensitive actions can be PIN-gated on the phone. The PIN is used to unlock the local signing key; it never transits the relay.
Audit trail for remote commands — every accepted remote command is written to the immutable audit log with the device ID, previous level, new level, and trigger string, indistinguishable from a local admin action in the audit chain.
Pairing-database isolation — paired-device records live in a dedicated pairing_db SQLite file (WAL, synchronous=NORMAL) separate from the audit and metering databases, so recovery or rotation of one domain does not touch the others.
Tool-name anonymization (optional) — tool names are included in remote state snapshots by default (e.g., Read, Bash, WebSearch). If your agents use custom-named tools that could reveal the nature of your work, enable "Anonymize tool names" in Settings to replace them with generic categories (file_read, file_write, shell, web, api_call, other) before they leave your machine. The local audit trail and dashboard always show the real names — the mapping only applies to the encrypted snapshot pushed to paired devices. Startup default is off; flip at runtime via Settings (desktop or mobile) or set CRAWDAD_ANONYMIZE_TOOLS=1 to enable at boot.
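The mapping described above might look like the following sketch. Only the category names and the CRAWDAD_ANONYMIZE_TOOLS variable come from the text; the specific tool-to-category assignments, the Write tool, and both function names are assumptions made for illustration.

```python
import os

# ASSUMPTION: illustrative assignments; only the category names are from the text.
CATEGORY_BY_TOOL = {
    "Read": "file_read",
    "Write": "file_write",
    "Bash": "shell",
    "WebSearch": "web",
}


def anonymize_enabled(environ=os.environ) -> bool:
    """True when anonymization is switched on at boot via the environment."""
    return environ.get("CRAWDAD_ANONYMIZE_TOOLS") == "1"


def snapshot_tool_name(tool: str, anonymize: bool) -> str:
    """Name to place in the remote snapshot; local logs keep the real name."""
    if not anonymize:
        return tool
    # Unknown or custom tool names fall into the generic "other" bucket.
    return CATEGORY_BY_TOOL.get(tool, "other")


print(snapshot_tool_name("Bash", anonymize=True))                # shell
print(snapshot_tool_name("acme_merger_lookup", anonymize=True))  # other
```

Defaulting unknown names to "other" means a newly added custom tool cannot leak its name through a snapshot before anyone updates the mapping.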
Zero-knowledge proof in the UI itself: tapping Investigate on an alert shows only detection metadata — time, agent, category, pattern, severity, verdict. No prompt text. No response text. No tool arguments. There is no “view content” button, because there is no content for the phone to read.
Protection Mode Auth Boundary
Mode-changing endpoints (/api/v1/mode and /api/v1/mode/unpause) are mounted only on the management API and are not reachable from the proxy data path. A prompt-injection payload that arrives through traffic on the provider proxy ports cannot trigger a mode change or pause. Mode changes from a paired mobile device are signed and audited. Pause does not unquarantine quarantined agents — quarantine state is independent of mode. Allow-Always rules continue to apply at every mode.
Forensic logging continues at every mode. When a detection would have blocked but is suppressed by Reduced or Paused mode, an audit_events row is recorded with action_taken="suppressed_by_mode" and a detail JSON containing the layer, category, protection_mode at time of event, and agent identity. The spec promise — "audit log gets fuller, never thinner, when protection is reduced" — holds.
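As a rough illustration of the suppressed-detection record described above, the sketch below builds one row as a dictionary. The action_taken value and the detail fields layer, category, and protection_mode are named in the text; the agent key, the function name, and the sample values are assumptions, and the real audit_events schema may differ.

```python
import json


def suppressed_event(layer: str, category: str, protection_mode: str, agent: str) -> dict:
    """Build an audit row for a detection suppressed by the current mode (hypothetical)."""
    detail = {
        "layer": layer,
        "category": category,
        "protection_mode": protection_mode,  # mode at the time of the event
        "agent": agent,                      # ASSUMPTION: key name for agent identity
    }
    return {
        "action_taken": "suppressed_by_mode",
        "detail": json.dumps(detail, sort_keys=True),
    }


row = suppressed_event("L3", "exfiltration", "Reduced", "agent-42")
print(row["action_taken"])
print(row["detail"])
```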
Data Storage
Local-only storage — all data is stored on your machine in SQLite databases under a per-user data directory (~/Library/Application Support/crawdad/ on macOS, ~/.local/share/crawdad/ on Linux, %APPDATA%\crawdad\ on Windows). The directory is created with owner-only permissions (0700) and the sidecar refuses to start on a group- or world-accessible path. Crawdad does not transmit content to any external server.
No encryption at rest — local data relies on your operating system's disk encryption (e.g., FileVault, LUKS). We recommend enabling full-disk encryption.
Cryptographic audit trail — audit log entries are SHA-256 Merkle-chained and Ed25519 signed, providing tamper evidence and non-repudiation.
API forwarding — requests are forwarded to AI providers over HTTPS (TLS 1.2+), exactly as your agent would send them without Crawdad.
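The owner-only (0700) requirement on the data directory described in this section can be checked on POSIX systems as sketched below. The is_owner_only helper is hypothetical and shows one way to test for group or world access; it is not Crawdad's startup check, and it does not apply to Windows ACLs.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if no group or other permission bits are set on path (POSIX only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Example: a refuse-to-start style guard on a data directory.
data_dir = os.path.expanduser("~")  # stand-in path for illustration only
if not is_owner_only(data_dir):
    print(f"{data_dir} is accessible to group or others")
```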
Access Controls
API keys — scoped, rotatable tokens for SDK and REST API access. Each key is bound to a single project and can be revoked instantly.
Admin keys — separate elevated credentials for account management, billing, and configuration changes. Never used for runtime operations.
Audit trail — every access event (login, key creation, key rotation, policy change) is recorded in the immutable audit log.
Least privilege — all internal services run with minimal permissions. No service has access to another service's data unless explicitly required.
Audit Logging
Immutable log — audit records are append-only and cannot be modified or deleted by any user, including administrators.
SHA-256 Merkle chain — each log entry is chained to the previous entry using SHA-256 hashes, forming a tamper-evident Merkle chain. Any modification to a past entry invalidates all subsequent hashes.
Ed25519 signed — every log entry is digitally signed with Ed25519, providing cryptographic proof of authenticity and non-repudiation.
Retention — 7 days on Free, 30 days on Pro, 60 days on Team, 90 days on Business, unlimited on Enterprise. Logs are exportable in CEF/JSON at any time.
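The hash chaining described in this section can be illustrated with a short sketch. It assumes each entry's hash covers the previous entry's hash plus a canonical JSON encoding of the payload; the field names, the all-zero genesis value, and the helper functions are hypothetical. Ed25519 signing is left out because it is not available in the Python standard library.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: placeholder previous-hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited entry breaks the chain from that point on."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"event": "key_created"})
append(log, {"event": "policy_changed"})
print(verify(log))  # True
log[0]["payload"]["event"] = "tampered"
print(verify(log))  # False
```

Hashing alone gives tamper evidence only to someone who holds a trusted copy of the latest hash; the per-entry signatures described above are what tie each entry to the signing key.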
Incident Response
4-hour alert window: all affected customers are notified within 4 hours of a security incident being confirmed.
Responsible disclosure — we follow coordinated disclosure practices and credit researchers who report vulnerabilities.
Compliance
GDPR — Crawdad stores data locally on your machine. No personal data is processed by Crawdad's servers. You have direct control over all stored data and can delete it at any time.
SOC 2 Type II — not yet completed.
FedRAMP — not yet completed.
HIPAA — Crawdad's local-first architecture means no PHI is transmitted to Crawdad. Contact contact@getcrawdad.dev for compliance questions.
Penetration Testing
No professional third-party penetration test has been completed.
Once a test is completed, the results will be published in full, with no redactions beyond researcher-requested coordination delays.
Responsible Disclosure
If you discover a security vulnerability in Crawdad, please report it through the contact form at getcrawdad.dev. We commit to a 24-hour initial response and will work with you to understand and resolve the issue before any public disclosure. We do not pursue legal action against good-faith security researchers.
Contact
For security questions, vulnerability reports, or compliance inquiries: