Direct Prompt Injection

Inject malicious instructions directly into agent input to override system prompt behavior

Tactic

Initial Access (Stage 2)

Gain control over agent behavior through prompt manipulation or input exploitation

Attack Class

SOUL-INJECT

Directly manipulating or overriding the agent's system-level instructions and behavioral boundaries

Evidence

observed

Confirmed in real-world production systems or internet-wide exposure assessments.

DVAA Validation

L1-03

Honeypot Coverage (AgentPwn)

Liveprompt-injectionagentpwn.com/attacks/prompt-injection

An AgentPwn trap page produces a payload tagged with this technique class. Following the AgentPwn taxonomy of trap pages shows what an agent encounters.

Detection (HackMyAgent)

Live4 live · 0 queued

PROMPT-001PROMPT-002PROMPT-003PROMPT-004

npx hackmyagent secure --ciLive = check implemented in hackmyagent; queued = declared, not yet implemented

Defense (OASB Controls)

Live5 live · 0 queued

OASB 3.1 OASB 3.2 OASB 3.3 OASB 3.4 OASB 3.5

Live = documented at oasb.ai; queued = declared, not yet documented

How to Cite

AI Agent Threat Matrix T-2001 (Direct Prompt Injection). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-2001

← Back to Matrix