T-1003ValidatedDeclining

System Prompt Extraction

Extract the agent's system prompt to understand its instructions, constraints, and behavioral boundaries

Tactic

Reconnaissance · Stage 1

Map the target agent's attack surface, capabilities, and behavioral boundaries

Attack class

SOUL-INJECT

Directly manipulating or overriding the agent's system-level instructions and behavioral boundaries

Evidence grade
Validated

Reproduced in a controlled lab environment (DVAA) with documented steps.

DVAA validation

L1-01

Reproductions in Damn Vulnerable AI Agent, the OpenA2A intentionally-broken agent for kill-chain validation.

Honeypot

AgentPwn coverage

Live
data-exfiltrationagentpwn.com/learn ↗

An AgentPwn trap page produces a payload tagged with this technique class. Following the AgentPwn taxonomy of trap pages shows what an agent encounters.

System-prompt extraction tiers coax the agent into echoing its instructions verbatim.

Detect

Detection · HackMyAgent

Live1 live · 0 queued
WEBEXPOSE-001
npx hackmyagent secure --ciLive = implemented in hackmyagent; queued = declared
Defend

Defense · OASB controls

Live4 live · 0 queued
Live = documented at oasb.ai; queued = declared
Reference

How to cite

AI Agent Threat Matrix T-1003 (System Prompt Extraction). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-1003