Agentic AI Security—How can autonomous agents be hijacked to steal data?

You received a normal email. No malicious links. No suspicious attachments. But that single email was enough for your company's AI assistant to silently send all your confidential data to an attacker!🦕
This isn't a hypothetical. This is CVE-2025-32711 — and it already happened!
AI systems are no longer stateless query-response machines. Modern AI assistants operate as autonomous agents — perceiving their environment, reasoning over context, and executing multi-step actions across tools, APIs, and data sources. This architectural shift from passive chatbot to active agent fundamentally expands the attack surface.
A traditional chatbot is stateless and single-turn — it generates text and stops. An AI Agent operates differently: based on the ReAct framework, it runs a continuous "Perceive → Reason → Act → Observe" loop, maintaining memory across sessions, calling APIs, executing code, and chaining actions — without explicit human approval. As MIT Sloan defines it: "autonomous software systems that perceive, reason, and act in digital environments." The critical word is act. And that's where security implications begin.
Agentic AI introduces a fundamentally new threat model:
1. Agent Goal Hijacking (ASI01 — OWASP 2026) — Hidden instructions in a document or email redirect the agent's behavior entirely.
2. Excessive Agency — Over-permissioned agents turn a single compromise into full system access.
3. Insecure Inter-Agent Communication — A compromised agent propagates malicious instructions across the entire pipeline.
4. Agentic Supply Chain Vulnerabilities (ASI04 — OWASP 2026) — Malicious tools or plugins silently corrupt agent behavior.
5. Prompt Injection in Agentic Context — Unlike chatbots, a successful injection here triggers real-world actions. The blast radius is exponentially larger.
✅Real-world cases make the risk undeniable:
Case 1: GitHub MCP Hijack (CVE-2025-6514) —A malicious GitHub issue containing hidden instructions hijacked an AI agent and triggered data exfiltration from private repositories. No malware — just text the model interpreted as commands.
Case 2: Mexico Government Breach A single attacker weaponized AI agents to breach nine government agencies — 195 million records, 150GB of data exfiltrated. The agent autonomously executed 5,317 commands across 34 sessions. No CVE assigned — just 20 unpatched known vulnerabilities and an AI doing the heavy lifting.
Chatbots could say the wrong thing. Agents can do the wrong thing — at scale, autonomously, and often without leaving a trace. As agentic AI becomes the backbone of enterprise workflows, securing it is no longer optional. The question is not if your organization will deploy AI agents — but whether you'll secure them before someone else exploits them.
Have you started thinking about agentic AI security in your organization? What's your biggest concern?
Resources:
https://arxiv.org/html/2510.23883v2



