RAG Poisoning

Attacks that corrupt the retrieval layer of an AI app, causing the model to ground its answers on attacker-controlled content.

OWASP LLM: LLM04high: 4medium: 4

Embedding collision

Attacker crafts content whose embedding is close to a high-value query, ensuring it gets retrieved.

User feedback (chat turns, thumbs-up examples) gets fed back into the retrieval index, letting attackers inject persistent content.

Document metadata fields like title or author are concatenated into the prompt and contain the attack payload.

Documents indexed from web crawls contain HTML comments with attacker instructions; comments survive into the prompt.

Attacker submits many slight variants of the same poisoned content to dominate top-k retrieval for a target query.

Attacker uses a long-lived public document (Wikipedia stub, GitHub README) as a pinned source the AI app trusts.

PDFs and images uploaded to the KB contain hidden text layers with attack instructions.

Attacker queries craft retrieval that exposes private documents the operator forgot to filter from the index.