Is an AI-to-AI attack scenario a science fiction possibility only for blockbusters like the Terminator series of movies?
Well, maybe not!
Researchers recently discovered that one AI agent can “inject malicious instructions into a conversation, hiding them among otherwise benign client requests and server responses.” While known AI threats involve tricking an agent with malicious data, this new threat exploits a property of the Agent2Agent (A2A) protocol to remember recent interactions and maintain coherent conversations.
AI agents interact with each other, use internal APIs, and operate with privileges. Since traditional AI guardrails and legacy API security no longer cut it, there’s a need for a new approach to security.
Agent2Agent Prompt Injection and Emerging Threats
AI agents can communicate, issue instructions, and bypass human oversight, which makes them both valuable and dangerous.
Prompt injection used to involve a user writing a malicious prompt. Now, agents can write malicious prompts that target other agents. This fact has expanded threat vectors and transformed the risk model. Internal API misuse, lateral movement, and chain-of-agent compromise threats are now more acute than ever.
According to security researchers, an AI agent can cause another agent to operate in unintended ways, potentially forcing, for example, data disclosure or unauthorized tool use. It achieves this by delivering multi-stage prompt injections leveraging stateful cross-agent communication behavior of the Agent2Agent (A2A) communication protocol.
The A2A protocol is an open standard that facilitates interoperable communication among AI agents, regardless of vendor, architecture or underlying technology. Its core objective is to enable agents to discover, understand and coordinate with one another to solve complex, distributed tasks while preserving autonomy and privacy. The protocol is similar to Model Context Protocol (MCP). However, MCP focuses on execution through tool integration, which A2A’s goal is agent orchestration.
This risk, dubbed as agent session smuggling attack, is crucial to security leaders for several reasons:
- The attack surface now encompasses new threats and attack vectors, including agent APIs, internal tool access, inter-agent messaging, and privileged actions.
- Traditional guardrails, such as filtering outputs, may no longer suffice. Attackers can create inputs at the agent-to-agent level, where filtering might not exist.
While previously theorized, this emerging threat highlights the diversity of possible attack scenarios available and waiting to be launched in real environments. Security teams must face up to this new reality. They need to establish a new layer of defense.
Introducing A2AS: A New Standard for Agentic AI Security
The A2AS framework is that new defense layer.
A2AS secures AI agents and LLM-powered applications, just as HTTPS secures HTTP. Researchers built it to address agentic AI security risks. Those risks include prompt injection, tool misuse, and agent compromise. It centers around three breakthrough capabilities:
- Behavior Certificates: Declaring and enforcing what agents can and can’t do.
- Model Self-Defence Reasoning: Embed security awareness in the model’s context window. This ensures the model rejects malicious or untrusted instructions in real time
- Prompt-Level Security Controls: Authenticated prompts, sandboxing, policy-as-code, verify every interaction.
A2AS is important because it represents a shifting approach to security. Agents have privileges and tool access, so monitoring and filtering alone isn’t enough. Security models must now secure the runtime, not just the input. That means integrating runtime self-defense, certification, and enforcement.
Wallarm, together with researchers from AWS, Bytedance, Cisco, Elastic, Google, JPMorganChase, Meta, and Salesforce, played an instrumental role in developing A2AS and is spearheading its adoption.
What A2AS Brings to Agentic AI Security
The A2AS framework aims to ensure AI agents can only do what they’re explicitly allowed to do, and every instruction they see must be authenticated, isolated, and verified.
To make that possible, A2AS uses a five-part standard called the BASIC model.
Behavior certificates define the exact capabilities an agent is permitted to use. That includes tools, files, functions, or system operations. If it’s not certified, it doesn’t happen. These certificates are how A2AS prevents an infected agent from escalating privileges.
Authenticated prompts verify the integrity of every instruction before it enters the context windows. They stop tampered, spoofed, or injected messages from influencing agent reasoning.
Security Boundaries isolate untrusted content from trusted system instructions. They do this by tagging and segmenting everything that enters the model, eliminating the ambiguity that makes prompt injection possible.
In-context defenses embed security reasoning directly inside the model’s context window. They guide the agent to distrust external input, ignore unsafe commands, and actively neutralize malicious patterns during execution.
Codified policies enforce business rules at runtime. That means they block sensitive data, requiring approvals for high-risk actions, and ensure compliance without manual oversight.
Together, these controls create a self-defending agent that can resist user-to-agent attacks, prevent tool misuse, and stop agent-to-agent prompt infections before they spread.
Why A2AS Will Matter for Enterprises and SOCs
As agentic AI adoption grows, the need for standardized security is becoming increasingly urgent. Autonomous agents now manage operations, access sensitive data, and interact with internal systems. That explodes the attack surface.
Moreover, agents communicate, call privileged tools, and operate via APIs never designed for autonomous decision-making. A2AS provides a unifying framework to secure this complexity – similar to how NIST frameworks shaped traditional cybersecurity.
Attackers are shifting focus from traditional breaches to manipulating agent behavior. A compromised agent can trigger unauthorized transactions, leak regulated data, or propagate malicious instructions. SOCs will need A2AS-aligned controls to detect, contain, and attribute these attacks.
Regulations are also evolving. Failing to secure AI agents could soon mean non-compliance.
How, then, can organizations prepare for A2AS?
Here’s a short checklist for locking down your agentic AI systems and preparing for the A2AS framework:
| Action | Description |
| Inventory agentic systems | Identify autonomous agents, their identities, inter-agent communication paths, exposed APIs, and execution privileges. Establish clear ownership and trust boundaries for each agent. Wallarm can help you discover all your ecosystem. |
| Map agent behavior and exposure | Document what actions each agent is allowed to perform, which tools and data sources it can access, and which prompts or instructions it can receive or generate. This forms the basis for behavior certification. |
| Enforce runtime protection | Apply real-time controls to inspect and block malicious prompts, unauthorized tool calls, and abnormal agent behavior across APIs and agent interactions. Security must operate at runtime—not only at design time. |
| Implement behavior certification & policy enforcement | Define and enforce agent behavior certificates, authenticated prompts, and policy-as-code controls to ensure agents act only within approved intent, scope, and authority, in line with A2AS principles. |
| Monitor, detect, and attribute agent activity | Continuously monitor agent-to-agent interactions, prompt flows, outputs, and tool usage. Enable SOC teams to detect manipulation attempts, attribute actions to specific agents, and contain compromised behavior. |
| Adopt and align with agentic AI standards | Align internal security controls with emerging frameworks like A2AS to ensure consistency, interoperability, and readiness for future regulatory and industry requirements. |
The time to act is now – waiting until there’s a breach is too late.
Secure the Future of Agentic AI
Hardening agent-to-agent communications and agentic orchestration represents a new frontier of cybersecurity strategy. A2AS offers a framework for protecting against many use cases from agent misbehavior and prompt injections to insecure AI supply chains. Wallarm provides the practical implementation. As enterprises embrace agentic AI, security can’t be an afterthought. They must bake it in at runtime, at the API/agent boundary.
To learn more about the A2AS framework, you can visit their website. To find out how Wallarm can help you prepare, schedule a demo today.
The post From Agent2Agent Prompt Injection to Runtime Self-Defense: How Wallarm Redefines Agentic AI Security appeared first on Wallarm.
