Classifying attacks against AI agent systems
9 tactics. 57 techniques. 36 attack classes. Every technique is grounded in observed adversary behavior or validated in a controlled lab environment. Designed to complement MITRE ATT&CK and OWASP, not replace them.
Threat Matrix
Reconnaissance
Stage 1: Map the target agent's attack surface, capabilities, and behavioral boundaries
Initial Access
Stage 2: Gain control over agent behavior through prompt manipulation or input exploitation
Credential Harvest
Stage 3: Extract API keys, tokens, and credentials from agent context and connected services
Privilege Escalation
Stage 4: Escalate capabilities beyond declared scope or bypass authorization
Lateral Movement
Stage 5: Pivot from a compromised agent to connected services or other agents
Persistence
Stage 6: Establish persistent access that survives restarts and session changes
Collection
Stage 7: Gather and stage data from databases, file systems, and APIs
Exfiltration
Stage 8: Transfer collected data out of the target environment
Impact
Stage 9: Modify data, deploy malicious code, or disrupt services
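The nine tactics above form an ordered kill chain, which lends itself to a structured encoding. A minimal sketch of one possible representation follows; the class name, field layout, and abridged goal strings are illustrative, not the matrix's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tactic:
    stage: int   # position in the kill chain, 1..9
    name: str    # tactic name as it appears in the matrix
    goal: str    # abridged adversary goal for this stage

# Illustrative encoding of the nine tactics; goal text abridged from the matrix.
TACTICS = [
    Tactic(1, "Reconnaissance", "Map the agent's attack surface and behavioral boundaries"),
    Tactic(2, "Initial Access", "Gain control over agent behavior via prompt manipulation"),
    Tactic(3, "Credential Harvest", "Extract keys, tokens, and credentials from agent context"),
    Tactic(4, "Privilege Escalation", "Escalate capabilities beyond declared scope"),
    Tactic(5, "Lateral Movement", "Pivot from a compromised agent to connected services"),
    Tactic(6, "Persistence", "Establish access surviving restarts and session changes"),
    Tactic(7, "Collection", "Gather and stage data from databases, files, and APIs"),
    Tactic(8, "Exfiltration", "Transfer collected data out of the target environment"),
    Tactic(9, "Impact", "Modify data, deploy malicious code, or disrupt services"),
]
```

An encoding like this makes the ordering explicit and lets tooling iterate over stages rather than parse prose.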
Evidence Standard
Confirmed in real-world production systems, security incidents, or internet-wide exposure assessments.
Reproduced in a controlled lab environment (DVAA) with documented steps and independent verification.
A well-understood traditional technique applied to the AI agent context; not yet observed against agents specifically.
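The three evidence tiers above could be modeled as a level attached to each technique record. The tier names below are illustrative labels paraphrasing the descriptions, not the matrix's official terms:

```python
from enum import Enum

class Evidence(Enum):
    # Labels are illustrative; values paraphrase the three tiers above.
    OBSERVED = "confirmed in real-world systems, incidents, or exposure assessments"
    REPRODUCED = "reproduced in a controlled lab (DVAA) with documented steps"
    INFERRED = "traditional technique applied to agents, not yet observed agent-specifically"

def strongest(levels):
    """Return the strongest evidence tier among a technique's sources."""
    order = [Evidence.INFERRED, Evidence.REPRODUCED, Evidence.OBSERVED]
    return max(levels, key=order.index)
```

A technique backed by multiple sources would then carry its strongest tier, keeping the standard machine-checkable.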
Where This Fits
MITRE ATT&CK
Enterprise network and endpoint attacks. Covers the infrastructure layer below the agent.
MITRE ATLAS
Adversarial ML and model-level attacks. Covers the model layer below the agent.
Agent Threat Matrix
Agent infrastructure, governance, protocols, memory, and identity. Covers the agent layer between the model and the user.