Embedding collision
An attacker crafts content whose embedding lies close to that of a high-value query, so the retriever returns it whenever that query is asked.
Severity: high
OWASP LLM: LLM04
How it works
Using gradient methods or trial and error against a public embedding model, the attacker generates a passage that lands in the same vector neighborhood as queries about pricing, support, or compliance. The poisoned passage is then submitted to any source that accepts user contributions and feeds the index.
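A minimal sketch of why this works, assuming a retriever that ranks purely by cosine similarity. The three-dimensional vectors and document names below are hypothetical stand-ins chosen for illustration, not output from any real embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings; real models use hundreds of dimensions.
query_vec = [0.9, 0.1, 0.0]  # stands in for "how do I get a refund"
index = {
    "official_refund_policy": [0.8, 0.3, 0.1],
    "shipping_faq": [0.1, 0.9, 0.2],
    # Assumed result of the attacker's optimization: almost on top of the query.
    "poisoned_passage": [0.9, 0.12, 0.01],
}

def top_k(query, store, k=1):
    """Return the k document ids whose vectors are closest to the query."""
    ranked = sorted(store, key=lambda doc_id: cosine(query, store[doc_id]), reverse=True)
    return ranked[:k]

winner = top_k(query_vec, index)
```

In this toy index the poisoned passage outranks the official policy because the ranking considers only vector proximity, not authorship or correctness.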
Example payload
[Adversarial text optimized to embed near 'how do I get a refund', containing a rogue refund instruction.]
Defenses
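Require trusted-author signing for indexed content, so that unsigned submissions never reach the index. Use cross-encoder rerankers to validate retrieval relevance: they score the query and passage jointly, so text optimized only against the embedding model is less likely to pass. Diversify retrieval across multiple stores so that a single poisoned source cannot dominate the results.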
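The first two defenses could be sketched roughly as follows. Everything named here is an assumption for illustration: `TRUSTED_KEYS`, `is_trusted`, and `rerank` are hypothetical, HMAC with a shared secret stands in for a real public-key signature scheme, and `overlap_score` is a toy lexical scorer standing in for an actual cross-encoder model.

```python
import hashlib
import hmac

# Hypothetical registry of trusted authors and their secret keys.
TRUSTED_KEYS = {"docs-team": b"example-secret-key"}

def sign(author, text):
    """Produce a hex HMAC-SHA256 tag for text under the author's key."""
    return hmac.new(TRUSTED_KEYS[author], text.encode("utf-8"), hashlib.sha256).hexdigest()

def is_trusted(author, text, signature):
    """Admit content to the index only if a known author's tag verifies."""
    key = TRUSTED_KEYS.get(author)
    if key is None:
        return False
    expected = hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

def overlap_score(query, passage):
    """Toy stand-in for a cross-encoder: fraction of query words found in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0

def rerank(query, passages, score_fn, threshold):
    """Drop passages scoring below threshold, then order the rest best-first."""
    scored = [(score_fn(query, p), p) for p in passages]
    kept = [(s, p) for s, p in scored if s >= threshold]
    return [p for s, p in sorted(kept, key=lambda sp: sp[0], reverse=True)]

good = "To get a refund open a ticket"
tag = sign("docs-team", good)
results = rerank("how do I get a refund", [good, "zx qv refund lorem"], overlap_score, 0.4)
```

In a real deployment `score_fn` would wrap a cross-encoder model and the threshold would need tuning on held-out queries; the lexical scorer here only shows where that check sits in the pipeline.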