Skip to main content

Command Palette

Search for a command to run...

Agentic AI Security—How can autonomous agents be hijacked to steal data?

Updated
3 min read
Agentic AI Security—How can autonomous agents be hijacked to steal data?
N
Here, I deep-dive into the emerging world of AI Security, documenting my technical journey and sharing cutting-edge insights with the community. This space is born of a passion for writing—though my pen has previously explored completely different domains—and connects my experience in Web Pentesting with an academic background in cybersecurity that, early on, bridged the gap between security and AI.

You received a normal email. No malicious links. No suspicious attachments. But that single email was enough for your company's AI assistant to silently send all your confidential data to an attacker!🦕

This isn't a hypothetical. This is CVE-2025-32711 — and it already happened!

AI systems are no longer stateless query-response machines. Modern AI assistants operate as autonomous agents — perceiving their environment, reasoning over context, and executing multi-step actions across tools, APIs, and data sources. This architectural shift from passive chatbot to active agent fundamentally expands the attack surface.

A traditional chatbot is stateless and single-turn — it generates text and stops. An AI Agent operates differently: based on the ReAct framework, it runs a continuous "Perceive → Reason → Act → Observe" loop, maintaining memory across sessions, calling APIs, executing code, and chaining actions — without explicit human approval. As MIT Sloan defines it: "autonomous software systems that perceive, reason, and act in digital environments." The critical word is act. And that's where security implications begin.

Agentic AI introduces a fundamentally new threat model:

1. Agent Goal Hijacking (ASI01 — OWASP 2026) — Hidden instructions in a document or email redirect the agent's behavior entirely.

2. Excessive Agency — Over-permissioned agents turn a single compromise into full system access.

3. Insecure Inter-Agent Communication — A compromised agent propagates malicious instructions across the entire pipeline.

4. Agentic Supply Chain Vulnerabilities (ASI04 — OWASP 2026) — Malicious tools or plugins silently corrupt agent behavior.

5. Prompt Injection in Agentic Context — Unlike chatbots, a successful injection here triggers real-world actions. The blast radius is exponentially larger.

✅Real-world cases make the risk undeniable:

Case 1: GitHub MCP Hijack (CVE-2025-6514) —A malicious GitHub issue containing hidden instructions hijacked an AI agent and triggered data exfiltration from private repositories. No malware — just text the model interpreted as commands.

Case 2: Mexico Government Breach A single attacker weaponized AI agents to breach nine government agencies — 195 million records, 150GB of data exfiltrated. The agent autonomously executed 5,317 commands across 34 sessions. No CVE assigned — just 20 unpatched known vulnerabilities and an AI doing the heavy lifting.

Chatbots could say the wrong thing. Agents can do the wrong thing — at scale, autonomously, and often without leaving a trace. As agentic AI becomes the backbone of enterprise workflows, securing it is no longer optional. The question is not if your organization will deploy AI agents — but whether you'll secure them before someone else exploits them.

Have you started thinking about agentic AI security in your organization? What's your biggest concern?

Resources:
https://arxiv.org/html/2510.23883v2