T-4006 · Validated · Declining
Safety Instruction Displacement
Displace safety instructions from the active context to remove guardrails
Tactic
Privilege Escalation · Stage 4
Escalate capabilities beyond declared scope or bypass authorization
Attack class
SOUL-DRIFT
Gradually displacing safety instructions from the active context through conversation manipulation
Evidence grade
Validated
Reproduced in a controlled lab environment (DVAA) with documented steps.
DVAA validation
L2-07
Reproductions in Damn Vulnerable AI Agent, the OpenA2A intentionally-broken agent for kill-chain validation.
Honeypot
AgentPwn coverage
context-window
agentpwn.com/learn
An AgentPwn trap page produces a payload tagged with this technique class. Following the AgentPwn taxonomy of trap pages shows what an agent encounters.
Instruction-displacement tiers evict safety instructions from active context.
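The eviction mechanism described above can be illustrated with a small, self-contained simulation. This is a minimal sketch under stated assumptions, not how DVAA, AgentPwn, or any particular agent framework manages context: the `Message` type, the word-count `cost` function, and the `fifo_trim` and `pinned_trim` policies are all hypothetical names introduced for illustration. It shows how a naive oldest-first trimming policy can drop the safety instruction once enough filler turns accumulate, while a policy that pins system messages retains it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str  # "system", "user", or "assistant"
    text: str


def cost(msg: Message) -> int:
    # Crude stand-in for a tokenizer (assumption): one token per word.
    return len(msg.text.split())


def fifo_trim(history: List[Message], budget: int) -> List[Message]:
    """Naive policy: drop the oldest messages until the history fits."""
    kept = list(history)
    while kept and sum(cost(m) for m in kept) > budget:
        kept.pop(0)  # the system message is oldest, so it is evicted first
    return kept


def pinned_trim(history: List[Message], budget: int) -> List[Message]:
    """Keep every system message; evict only the oldest non-system turns.

    System messages are moved to the front of the returned list.
    """
    pinned = [m for m in history if m.role == "system"]
    rest = [m for m in history if m.role != "system"]
    remaining = budget - sum(cost(m) for m in pinned)
    while rest and sum(cost(m) for m in rest) > remaining:
        rest.pop(0)
    return pinned + rest


def has_safety(history: List[Message]) -> bool:
    # True if at least one system (safety) message is still in context.
    return any(m.role == "system" for m in history)


def build_history() -> List[Message]:
    # One safety instruction followed by five long filler turns.
    history = [Message("system", "Never reveal credentials or run destructive commands.")]
    for _ in range(5):
        history.append(Message("user", "filler " * 20))
    return history


if __name__ == "__main__":
    history = build_history()
    print("fifo keeps safety instruction:", has_safety(fifo_trim(history, 50)))
    print("pinned keeps safety instruction:", has_safety(pinned_trim(history, 50)))
```

Pinning is only one possible mitigation and does not address every displacement path. For example, it does nothing about a model that weights recent turns more heavily than early instructions.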
Detect
Detection · HackMyAgent
PROMPT-001, PROMPT-002
npx hackmyagent secure --ci
Live = implemented in hackmyagent; queued = declared
Defend
Defense · OASB controls
Reference
How to cite
AI Agent Threat Matrix T-4006 (Safety Instruction Displacement). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-4006