Attack Classes
36 attack classes group related techniques by vulnerability pattern. Each class represents a distinct category of threat to AI agent systems.
Governance (10 classes)
PHANTOM-SOUL
1 techniquesAgent deployed with zero behavioral constraints — no SOUL.md, no system prompt, no governance
SOUL-BOUNDARY
1 techniquesExploiting ambiguous or incomplete constraint definitions to find unguarded actions
SOUL-DELEGATE
1 techniquesExploiting delegation and capability transfer mechanisms to exceed authorized scope
SOUL-DRIFT
2 techniquesGradually displacing safety instructions from the active context through conversation manipulation
SOUL-FORK
1 techniquesDifferent behavior under evaluation vs production — agent passes safety tests but behaves differently in deployment
SOUL-HIJACK
2 techniquesExternal content achieving full override of agent behavioral constitution
SOUL-HV
2 techniquesTechniques to bypass agent harm avoidance constraints (4 sub-types)
SOUL-IMPERSONATE
1 techniquesFalse capability claims exceeding actual authorization level
SOUL-INJECT
5 techniquesDirectly manipulating or overriding the agent's system-level instructions and behavioral boundaries
Supply Chain (9 classes)
HEARTBEAT-RCE
2 techniquesExploiting scheduled task or heartbeat mechanisms to achieve persistent code execution
MEM-POISON
5 techniquesInjecting malicious entries into agent persistent memory to maintain control across sessions
ORG-SKILL-SPREAD
4 techniquesPropagating malicious capabilities across an organization's agent fleet through shared skills and registries
RAG-POISON
2 techniquesInjecting malicious content into retrieval-augmented generation data sources
SKILL-EXFIL
6 techniquesUsing legitimate tool capabilities for unauthorized data transfer
SKILL-FRONTMATTER
3 techniquesEmbedding malicious instructions in skill or plugin metadata and description fields
SKILL-MEM-AMP
1 techniquesSkill plants payload in agent memory that survives skill uninstall — cross-session persistence
SUPPLY-CHAIN-INSTALL
1 techniquesUnsigned installation scripts executed without integrity verification — curl|sh without checksum
Infrastructure (9 classes)
A2A-EXPOSE
2 techniquesAgent-to-Agent protocol endpoints publicly discoverable without access control
AITOOL-EXPOSE
3 techniquesAI development tools (Jupyter, MLflow, Gradio, Streamlit) exposed without authentication
CODE-INJECTION
2 techniquesInjecting and executing arbitrary code through SQL injection, command injection, or code generation
GATEWAY-EXPLOIT
1 techniquesModifying gateway or proxy configurations to intercept, redirect, or manipulate agent traffic
INTEGRITY-BYPASS
1 techniquesDigest or hash verification bypass on empty or missing values — tampered artifacts pass silently
LLM-EXPOSE
2 techniquesLLM inference endpoints exposed without authentication — allows arbitrary prompt execution
MCP-EXPLOIT
6 techniquesAttacking Model Context Protocol server configurations, tool registrations, and inter-server trust
RETROACTIVE-PRIV
9 techniquesExploiting previously granted access or cached credentials to gain unauthorized capabilities
TOCTOU-RACE
1 techniquesTime-of-check-time-of-use race between verification and execution — swap window for attackers
NemoClaw-Specific (5 classes)
NEMO-CRED-LEAK
2 techniquesUnintended exposure of credentials through environment variables, logs, or error messages
NEMO-NETWORK-EXPOSE
3 techniquesNemoClaw network services bound to public interfaces — gateway, k3s API, mDNS beacons
NEMO-OPENCLAW-INHERIT
1 techniquesInherited OpenClaw flaws that survive NemoClaw sandboxing — heartbeat persistence, pre-allowed APIs
NEMO-SANDBOX-ESCAPE
1 techniquesBreaking out of agent sandbox restrictions to access the underlying file system or OS
NEMO-SUPPLY-CHAIN
1 techniquesCompromising upstream dependencies or infrastructure to affect downstream agent deployments
Identity (2 classes)
AGENT-IMPERSONATE
4 techniquesImpersonating trusted agents or administrative roles to gain unauthorized access
BEHAVIORAL-IMPERSONATE
1 techniquesUsing stolen credentials detected via behavioral baseline mismatch — agent DNA forgery
Sandbox (1 classes)
SANDBOX-ESCAPE
1 techniquesGeneral sandbox escape via privileged containers, LSM degradation, or process environment leakage