Healthcare is the #1 Target
$9.77M — average cost per healthcare breach. Highest of any sector for 14 consecutive years (IBM/Ponemon 2025).
259M Americans had their health records breached in 2024 alone — 192.7M from a single attack (Change Healthcare).
In a multi-agent AI system, a security breach doesn't just affect a single record. A compromised AI agent can instantly expose or alter every patient file it accesses before security systems can detect the threat.
The Attack Surfaces When Models Talk to Models
- Prompt Injection in Clinical Reports — Hidden instructions inside a radiology report silently flip specialist outputs. 47-68% success rate against frontier models (Greshake et al., AISEC '23).
- Tampered Model Weights — 100 fine-tuning examples and 1 GPU-hour subvert a safety-aligned model with 99.5% violation rate. Behavioral tests show nothing.
- Weak stdio/subprocess Boundary — A specialist with network access can exfiltrate data regardless of prompt injection. Must be removed structurally, not with config.
- OAuth Token Exfiltration — A malicious MCP server captures bearer tokens. Stolen = full access to every patient record.
5 Specialized AI Doctors. 1 Judge.
The architecture: 5 specialist AI models (Radiology, Cardiology, Oncology, Internal Medicine, Pathology), each running as an independent MCP server. An orchestrator coordinates all five.
**Byzantine Fault Tolerance: n=5, f≤1.** Five specialists vote on every case. Even if one is compromised, the majority holds. A single bad actor cannot flip the diagnosis alone.
The 4-Layer Defense Stack
- Interceptor Proxy — Normalizes Unicode, runs 400+ injection patterns, AI classifier (score > 0.7 = quarantined)
- Signed Manifests — Must compromise Sigstore OIDC identity
- stdio + seccomp — Kernel-level bypass required
- DPoP — Must steal private key from memory, not just token
Each layer compounds. Breaking the stack means beating every one independently.
An AI Caught What Clinicians Missed — 475 Days Earlier
Mayo Clinic's REDMOD AI detected pancreatic cancer on routine CT scan an average of 475 days before radiologists. Validated across 1,462 patients. Pancreatic cancer 5-year survival: ~15% because 85% of cases are caught too late.
The question they asked next: What happens when multiple specialist AI models review the same case together with MCP orchestration?
They built it. Then they tried to break it.