Multi-turn Jailbreaks

Step-by-step extraction

The attacker breaks a disallowed request into many individually allowed sub-questions and reassembles the answers offline.

Severity: medium
OWASP LLM: LLM01

How it works

Each sub-question is harmless in isolation. The attacker collects N answers and stitches them into the disallowed whole.

Example payload

(20 narrow questions about isolated chemistry steps; attacker assembles outside the chat.)

Defenses

Detect decomposition patterns across the whole conversation rather than judging each message in isolation. Throttle deep follow-up chains in sensitive domains.
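As a rough illustration of the throttling idea, here is a minimal sketch that keeps a per-conversation budget of sensitive-domain turns. Everything in it is an assumption made for illustration: the `ConversationMonitor` class, the `SENSITIVE_TERMS` keyword set (a stand-in for a real domain classifier), and the `max_sensitive_turns` threshold are hypothetical, not part of any particular product or library.

```python
from dataclasses import dataclass

# Hypothetical keyword list standing in for a real sensitive-domain classifier.
SENSITIVE_TERMS = {"synthesis", "reagent", "precursor", "yield"}


@dataclass
class ConversationMonitor:
    """Tracks sensitive-domain turns across one conversation."""

    max_sensitive_turns: int = 3  # hypothetical budget, tune per domain
    sensitive_turns: int = 0

    def is_sensitive(self, message: str) -> bool:
        # Naive keyword match on lowercased, punctuation-stripped words.
        words = {w.strip(".,?!").lower() for w in message.split()}
        return bool(words & SENSITIVE_TERMS)

    def allow(self, message: str) -> bool:
        # Messages outside the sensitive domain are never throttled.
        if not self.is_sensitive(message):
            return True
        # Sensitive messages count against the conversation-level budget,
        # so the decision depends on history, not on this message alone.
        self.sensitive_turns += 1
        return self.sensitive_turns <= self.max_sensitive_turns
```

The point of the design is that state lives at the conversation level: the same message can be allowed early in a session and refused once the budget is spent. A keyword match is far too crude for production use and would need to be replaced by a proper classifier.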

Related patterns