System Prompt Extraction

Extract the agent's system prompt to understand its instructions, constraints, and behavioral boundaries

Tactic

Reconnaissance (Stage 1)

Map the target agent's attack surface, capabilities, and behavioral boundaries

Attack Class

SOUL-INJECT

Directly manipulating or overriding the agent's system-level instructions and behavioral boundaries

Evidence

validated

Reproduced in controlled lab environment (DVAA) with documented steps.

DVAA Validation

L1-01

Honeypot Coverage (AgentPwn)

Live · data-exfiltration · agentpwn.com/attacks/data-exfiltration

An AgentPwn trap page serves a payload tagged with this technique class. Browsing the AgentPwn taxonomy of trap pages shows what an agent encounters.

Detection (HackMyAgent)

Live (1 live · 0 queued)

WEBEXPOSE-001

npx hackmyagent secure --ci

Live = check implemented in hackmyagent; queued = declared, not yet implemented

Defense (OASB Controls)

Live (4 live · 0 queued)

OASB 1.1 OASB 1.2 OASB 1.3 OASB 1.4

Live = documented at oasb.ai; queued = declared, not yet documented
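
As an illustration only, and not drawn from the OASB controls or the HackMyAgent check listed above, one generic way to notice system prompt extraction is to embed a random canary marker in the system prompt and scan outgoing responses for that marker or for long verbatim slices of the prompt. The sketch below assumes a Python agent wrapper; the function names, the marker format, and the 40-character window are all hypothetical choices.

```python
import secrets


def make_canary() -> str:
    """Return a random marker to embed in the system prompt (hypothetical format)."""
    return f"CANARY-{secrets.token_hex(8)}"


def build_system_prompt(instructions: str, canary: str) -> str:
    """Append the canary to the instructions so any verbatim leak carries it."""
    return f"{instructions}\n[internal marker: {canary}]"


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial reformatting does not hide a leak."""
    return " ".join(text.split()).lower()


def leaks_system_prompt(
    response: str, system_prompt: str, canary: str, window: int = 40
) -> bool:
    """Return True if the response contains the canary or any
    window-length verbatim slice of the system prompt."""
    if canary in response:
        return True
    resp = _normalize(response)
    prompt = _normalize(system_prompt)
    if not prompt:
        return False
    if len(prompt) <= window:
        return prompt in resp
    # Slide a fixed-size window over the prompt and look for each slice.
    return any(
        prompt[i : i + window] in resp
        for i in range(len(prompt) - window + 1)
    )
```

A wrapper would call `leaks_system_prompt` on each response before returning it and block or log on a match. This only catches verbatim or near-verbatim leakage; a paraphrased or translated prompt would pass, so it is a tripwire rather than a complete control.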

How to Cite

AI Agent Threat Matrix T-1003 (System Prompt Extraction). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-1003
