Direct Prompt Injection
Inject malicious instructions directly into agent input to override system prompt behavior
Initial Access · Stage 2
Gain control over agent behavior through prompt manipulation or input exploitation
SOUL-INJECT
Directly manipulating or overriding the agent's system-level instructions and behavioral boundaries
Confirmed in real-world production systems or internet-wide exposure assessments.
L1-03
Reproductions in Damn Vulnerable AI Agent, the OpenA2A intentionally-broken agent for kill-chain validation.
AgentPwn coverage
An AgentPwn trap page produces a payload tagged with this technique class. Following the AgentPwn taxonomy of trap pages shows what an agent encounters.
Direct override is tier 1 of prompt injection.
Evidence by source
Evidence timeline
Agent followed APWN-PI-003 injection on agentpwn.com (prompt-injection tier 3)
Agent followed APWN-PI-001 injection on agentpwn.com (prompt-injection tier 1)
Detection · HackMyAgent
npx hackmyagent secure --ciLive = implemented in hackmyagent; queued = declaredDefense · OASB controls
How to cite
AI Agent Threat Matrix T-2001 (Direct Prompt Injection). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-2001