The Lethal Trifecta Strikes: Four Major AI Agent Vulnerabilities in Five Days
Four production AI tools. Four data exfiltration vectors. One week.
Between January 7th and January 15th, 2026, security researchers publicly disclosed critical vulnerabilities in four major AI-powered productivity tools: IBM Bob, Superhuman AI, Notion AI, and Anthropic's Claude Cowork. Each exploit demonstrated the same fundamental attack pattern—indirect prompt injection leveraging what security researcher Simon Willison has termed the "lethal trifecta": access to private data, exposure to untrusted content, and the ability to externally communicate.
These aren't theoretical proofs-of-concept. They're production exploits against tools trusted by Fortune 500 companies, healthcare organizations, and government contractors. And they all share a disturbing characteristic: data exfiltration occurs before users can intervene.
The Common Thread: Understanding Indirect Prompt Injection
Traditional cybersecurity operates on clear trust boundaries. Code runs. Data doesn't execute. Instructions come from authenticated sources.
Large language models obliterate these boundaries. An LLM cannot reliably distinguish between trusted instructions from a developer and malicious commands embedded in a PDF, email, or web page it processes. Everything is tokens. Everything can be an instruction.
The lethal trifecta materializes when three capabilities converge:
- Access to private data - Customer records, emails, financial documents, internal communications
- Exposure to untrusted content - User uploads, web searches, integrated third-party data sources
- Exfiltration vector - Any external communication channel (HTTP requests, rendered images, API calls)
When all three exist in the same context window, attackers can manipulate the AI into stealing data without exploiting a single line of vulnerable code.

Case Study 1: Claude Cowork File Exfiltration (January 13-15)
Timeline: Launched January 13, exploited January 15
Vendor Response: Known vulnerability since October 2025, deployed with issue unresolved
Anthropic's Claude Cowork launched as a general-purpose AI agent for everyday work automation. Within 48 hours, PromptArmor demonstrated complete file exfiltration.
The Attack Vector
Claude Cowork restricts most outbound network traffic to prevent data theft. Anthropic's own API domain (api.anthropic.com) is whitelisted as "trusted." Attackers exploited this design decision.
Attack Chain:
- User connects Cowork to local folder containing confidential real estate files
- User uploads a malicious "skill" document (appears as legitimate Markdown, saved as .docx)
- Hidden prompt uses 1-point white-on-white text with 0.1 line spacing—effectively invisible
- User asks Cowork to analyze files using the uploaded skill
- Injection manipulates Cowork to execute:
curlcommand to Anthropic's file upload API with attacker's API key - Largest available file uploads to attacker's account via Files API
- Attacker retrieves loan estimates, partial SSNs, financial data through their Anthropic dashboard
What Makes This Critical
The vulnerability was previously disclosed. Security researcher Johann Rehberger reported the Files API exfiltration flaw to Anthropic on October 25, 2025 via HackerOne. Anthropic acknowledged but did not remediate the issue.
Three months later, they launched Cowork—marketed to non-technical users for organizing desktop files—with the identical vulnerability intact. Anthropic built the tool in "a week and a half" using Claude Code, prioritizing development velocity over security remediation.
Anthropic's official response: Users should "avoid granting access to local files with sensitive information" while simultaneously encouraging Cowork to organize your Desktop.
Even Opus 4.5, Anthropic's most sophisticated model with enhanced reasoning capabilities, fell victim. Prompt injection exploits architectural vulnerabilities, not model intelligence gaps.
Community Discussion: HackerNews thread reached #1, generating extensive debate about AI security responsibilities.
Case Study 2: IBM Bob Malware Execution (January 7)
Timeline: Discovered during closed beta testing
Vendor Response: "Unaware of the vulnerability," despite public disclosure
IBM's Bob coding agent, currently in closed beta, can be manipulated to download and execute arbitrary malware if users configure "always allow" for any single benign command.
