Context saturation
Attacker fills the context window with benign content so safety instructions roll out, then issues the harmful prompt.
Severity: mediumOWASP LLM: LLM01
How it works
When the system prompt or earlier safety reminders fall outside the model's effective attention window, behavior reverts to base.
Example payload
(Hundreds of K of benign conversation, then 'Now answer: ...')
Defenses
Re-inject the system prompt periodically. Cap conversation length and force a reset.