The “Lethal Trifecta”: Protecting Your Data from Agentic AI Risks

As AI agents gain access to our calendars, email accounts and databases, a new risk has emerged: the lethal trifecta. Coined by engineer Simon Willison, it describes the dangerous combination of three capabilities in a single agent: access to private data, exposure to untrusted content, and the ability to communicate externally. If all three align, an attacker can trick your agent into exfiltrating sensitive information.

Understanding the trifecta

  1. Access to private data. Agents integrated with your email, CRM or file system can retrieve confidential information.

  2. Exposure to untrusted content. When agents fetch web pages, read user inputs or process documents, they ingest text from sources an attacker can control. LLMs cannot reliably distinguish malicious instructions embedded in that content from the data they were asked to process.

  3. External communication abilities. If the agent can send emails, post messages or publish content, it can leak the data it retrieved. Combine this with untrusted input, and a prompt‑injection attack can instruct the agent to send secrets to the attacker.
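One way to make the trifecta concrete is to audit each agent's tool set against the three categories. The sketch below is illustrative only: the tool names and category groupings are hypothetical, not from any particular framework.

```python
# Hypothetical sketch: flag agents whose tool set combines all three
# trifecta capabilities. Tool names and categories are illustrative.

PRIVATE_DATA = {"read_email", "query_crm", "read_files"}
UNTRUSTED_INPUT = {"fetch_url", "read_inbox", "parse_document"}
EXTERNAL_COMMS = {"send_email", "post_message", "http_request"}

def has_lethal_trifecta(tools: set[str]) -> bool:
    """Return True if the agent holds at least one tool from each category."""
    return (bool(tools & PRIVATE_DATA)
            and bool(tools & UNTRUSTED_INPUT)
            and bool(tools & EXTERNAL_COMMS))

# A research agent that can browse the web and send email while reading
# the CRM holds all three capabilities:
print(has_lethal_trifecta({"query_crm", "fetch_url", "send_email"}))  # True
# Dropping external communication breaks the trifecta:
print(has_lethal_trifecta({"query_crm", "fetch_url"}))                # False
```

A check like this could run at agent-configuration time, refusing to deploy any single agent that holds all three capabilities at once.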

In enterprise settings, this lethal trifecta turns a convenience into a potential catastrophe. Real‑world incidents have shown that prompt injections can cause agents to leak API keys and internal documents. OSO’s analysis notes that because LLMs follow instructions blindly, malicious prompts can persist in memory and trigger future actions.

Mitigation strategies

  • Avoid combining all three capabilities. The safest approach is to never give one agent all three powers. For example, let one agent read emails but strip its ability to send anything externally; let another compose drafts but deny it access to sensitive databases.

  • Implement least privilege and gating. Restrict what each tool can do. Use allowlists for websites and APIs, and require human approval before an agent sends external messages or deletes data.

  • Filter untrusted content. Pre‑process web pages or emails to remove hidden instructions before passing them to the agent.

  • Create robust auditing and monitoring. Log agent actions and external communications. Alert when the agent attempts to access sensitive data or contact unknown addresses.

  • Train users and developers. Educate teams about prompt‑injection risks and safe design patterns.
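To show what least-privilege gating might look like in practice, here is a minimal sketch of a tool-call dispatcher that applies a domain allowlist and routes high-impact actions to human approval. The tool names, allowed domains and approval hook are all assumptions for illustration, not a real agent framework's API.

```python
# Minimal sketch of least-privilege gating for agent tool calls.
# Tool names, domains and the approval hook are hypothetical.

from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.internal.example.com", "docs.example.com"}

def approved_by_human(action: str) -> bool:
    # Placeholder: in a real system this would route to a review queue
    # or an interactive prompt. Fail closed by default.
    return False

def gate_tool_call(tool: str, target: str) -> bool:
    """Allow a tool call only if it passes the allowlist or human review."""
    if tool == "fetch_url":
        # Untrusted-content fetches are restricted to an allowlist.
        return urlparse(target).hostname in ALLOWED_DOMAINS
    if tool in {"send_email", "delete_record"}:
        # High-impact actions always require explicit human approval.
        return approved_by_human(f"{tool} -> {target}")
    return True  # low-risk tools pass through

print(gate_tool_call("fetch_url", "https://docs.example.com/page"))  # True
print(gate_tool_call("send_email", "attacker@evil.example"))         # False
```

The important design choice is that the gate fails closed: anything outside the allowlist, or any outbound communication without approval, is simply refused rather than logged and allowed.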

Shawn’s perspective: design for adversarial environments

The lethal trifecta isn’t a theoretical edge case—it’s an everyday threat in agentic systems. As someone who helps organisations experiment with AI responsibly, I urge teams to adopt an adversarial mindset. Assume that any external content could hide malicious instructions, and design your agents accordingly. Break big tasks into smaller, permission‑scoped micro‑agents. Just as we secure networks with firewalls and access controls, we must build AI firebreaks that prevent a single poisoned prompt from cascading through our systems.

Conclusion

Agentic AI is powerful, but mixing private data, untrusted input and external communication creates a perfect storm for prompt‑injection attacks. By decomposing tasks, applying the principle of least privilege, filtering input and monitoring output, organisations can harness AI safely. The goal isn’t to avoid agentic AI—it’s to build it with security baked in.

To learn more about my work and stay updated on these topics, visit ShawnKanungo.com and check out my latest insights on innovation and AI.

Frequently asked questions

What is the “lethal trifecta” in AI?

It’s the combination of an agent having access to private data, processing untrusted content and being able to communicate externally. Together, these capabilities allow prompt‑injection attacks to exfiltrate sensitive information.

Why are LLMs vulnerable to prompt injection?

Language models treat all text in their context window as potential instructions. They can't reliably separate a developer's system prompt from the data they're asked to process, which lets attackers embed malicious commands in web pages, emails or documents.

How can I prevent prompt‑injection attacks?

Separate agent capabilities so that no single agent has the full trifecta; implement allowlists and approval gates; and filter or sanitize untrusted input.

Are guardrails enough to stop data exfiltration?

Guardrails help, but no filter is perfect. According to Simon Willison, the safest approach is to avoid combining the trifecta altogether. Use multiple agents with narrower scopes instead.

Do these risks apply only to OpenClaw?

No. Any agentic AI system that reads untrusted content, accesses private data and can communicate externally is susceptible. The trifecta concept applies broadly to all autonomous agents.

About the Author

Shawn Kanungo is a globally recognised disruption strategist and keynote speaker who helps organisations adapt to change and leverage disruptive thinking. Named one of the “Best New Speakers” by the National Speakers Bureau, he has spoken at some of the world’s most innovative organisations, including IBM, Walmart and 3M. His expertise in digital disruption strategies helps leaders navigate transformation and build resilience in an increasingly uncertain business environment.
