The Lethal Trifecta Strikes: Four Major AI Agent Vulnerabilities in Five Days

Breached Company

20 Jan 2026 — 12 min read

Four production AI tools. Four data exfiltration vectors. One week.

Between January 7th and January 15th, 2026, security researchers publicly disclosed critical vulnerabilities in four major AI-powered productivity tools: IBM Bob, Superhuman AI, Notion AI, and Anthropic's Claude Cowork. Each exploit demonstrated the same fundamental attack pattern—indirect prompt injection leveraging what security researcher Simon Willison has termed the "lethal trifecta": access to private data, exposure to untrusted content, and the ability to externally communicate.

These aren't theoretical proofs-of-concept. They're production exploits against tools trusted by Fortune 500 companies, healthcare organizations, and government contractors. And they all share a disturbing characteristic: data exfiltration occurs before users can intervene.

CISO Marketplace | Cybersecurity Services, Deals & Resources for Security Leaders

The premier marketplace for CISOs and security professionals. Find penetration testing, compliance assessments, vCISO services, security tools, and exclusive deals from vetted cybersecurity vendors.

Cybersecurity Services, Deals & Resources for Security Leaders

The Common Thread: Understanding Indirect Prompt Injection

Traditional cybersecurity operates on clear trust boundaries. Code runs. Data doesn't execute. Instructions come from authenticated sources.

Large language models obliterate these boundaries. An LLM cannot reliably distinguish between trusted instructions from a developer and malicious commands embedded in a PDF, email, or web page it processes. Everything is tokens. Everything can be an instruction.

The lethal trifecta materializes when three capabilities converge:

Access to private data - Customer records, emails, financial documents, internal communications
Exposure to untrusted content - User uploads, web searches, integrated third-party data sources
Exfiltration vector - Any external communication channel (HTTP requests, rendered images, API calls)

When all three exist in the same context window, attackers can manipulate the AI into stealing data without exploiting a single line of vulnerable code.

Case Study 1: Claude Cowork File Exfiltration (January 13-15)

Timeline: Launched January 13, exploited January 15
Vendor Response: Known vulnerability since October 2025, deployed with issue unresolved

Anthropic's Claude Cowork launched as a general-purpose AI agent for everyday work automation. Within 48 hours, PromptArmor demonstrated complete file exfiltration.

The Attack Vector

Claude Cowork restricts most outbound network traffic to prevent data theft. Anthropic's own API domain (api.anthropic.com) is whitelisted as "trusted." Attackers exploited this design decision.

Attack Chain:

User connects Cowork to local folder containing confidential real estate files
User uploads a malicious "skill" document (appears as legitimate Markdown, saved as .docx)
Hidden prompt uses 1-point white-on-white text with 0.1 line spacing—effectively invisible
User asks Cowork to analyze files using the uploaded skill
Injection manipulates Cowork to execute: curl command to Anthropic's file upload API with attacker's API key
Largest available file uploads to attacker's account via Files API
Attacker retrieves loan estimates, partial SSNs, financial data through their Anthropic dashboard

What Makes This Critical

The vulnerability was previously disclosed. Security researcher Johann Rehberger reported the Files API exfiltration flaw to Anthropic on October 25, 2025 via HackerOne. Anthropic acknowledged but did not remediate the issue.

Three months later, they launched Cowork—marketed to non-technical users for organizing desktop files—with the identical vulnerability intact. Anthropic built the tool in "a week and a half" using Claude Code, prioritizing development velocity over security remediation.

Anthropic's official response: Users should "avoid granting access to local files with sensitive information" while simultaneously encouraging Cowork to organize your Desktop.

Even Opus 4.5, Anthropic's most sophisticated model with enhanced reasoning capabilities, fell victim. Prompt injection exploits architectural vulnerabilities, not model intelligence gaps.

Community Discussion: HackerNews thread reached #1, generating extensive debate about AI security responsibilities.

Case Study 2: IBM Bob Malware Execution (January 7)

Timeline: Discovered during closed beta testing
Vendor Response: "Unaware of the vulnerability," despite public disclosure

IBM's Bob coding agent, currently in closed beta, can be manipulated to download and execute arbitrary malware if users configure "always allow" for any single benign command.

CISO Marketplace | Cybersecurity Services, Deals & Resources for Security Leaders

The premier marketplace for CISOs and security professionals. Find penetration testing, compliance assessments, vCISO services, security tools, and exclusive deals from vetted cybersecurity vendors.

Cybersecurity Services, Deals & Resources for Security Leaders

The Lethal Trifecta Strikes: Four Major AI Agent Vulnerabilities in Five Days

Breached Company

The Common Thread: Understanding Indirect Prompt Injection

Case Study 1: Claude Cowork File Exfiltration (January 13-15)

The Attack Vector

What Makes This Critical

Case Study 2: IBM Bob Malware Execution (January 7)

Read more

Tycoon 2FA Phishing Platform Taken Down in Global Law Enforcement Action

Operation Leak: FBI and Global Partners Dismantle LeakBase, One of the World's Largest Cybercriminal Data Forums

The Cyber War in the Shadows: How the 2026 Iran–Israel–U.S. Conflict Is Reshaping the Middle East’s Digital Battlefield

When the Cloud Burns: The AWS UAE Data Center Disaster and the DR/BCP Lessons Everyone Keeps Ignoring