Security Coverage
OWASP LLM01:2025 — Prompt Injection
RavelinStream's policy engine is aligned with the OWASP Top 10 for LLM Applications. Here's how each attack scenario maps to our detection layers.
Direct Prompt Injection
User inputs crafted to override the system prompt, exfiltrate secrets, or manipulate model behavior.
"Ignore all previous instructions", "override your rules", "you are now DAN"
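A pattern layer for phrases like those above can be sketched as below. This is an illustrative sketch, not RavelinStream's actual ruleset; the function name and regex list are assumptions:

```python
import re

# Illustrative subset of direct-injection phrases; a real policy
# engine would maintain a much larger, versioned ruleset.
DIRECT_INJECTION_PATTERNS = [
    r"ignore\s+(all\s+)?previous\s+instructions",
    r"override\s+your\s+rules",
    r"you\s+are\s+now\s+dan",
]

def matches_direct_injection(text: str) -> bool:
    """Return True if any known direct-injection phrase appears."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in DIRECT_INJECTION_PATTERNS)
```

Pattern matching is cheap and deterministic, which is why it runs before the LLM moderation layer; the LLM layer then covers paraphrases the patterns miss.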
Indirect Prompt Injection
Hidden instructions in external content (emails, webpages, PDFs) processed by the LLM.
Requires upstream content sanitization. LLM moderation catches intent.
Unintentional Injection
Users inadvertently triggering model behavior through legitimate-seeming inputs.
Lower severity. Context-dependent — LLM evaluates intent.
Model Influence via RAG
Poisoned documents in retrieval pipelines that alter LLM outputs.
Detects known injection patterns even when embedded in retrieved content.
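One way to apply that detection in a retrieval pipeline is to screen each retrieved chunk before it enters the model context. A minimal sketch, assuming a detector callback of the kind shown elsewhere on this page; `filter_retrieved` is a hypothetical helper, not a RavelinStream API:

```python
def filter_retrieved(docs, detector):
    """Split retrieved chunks into clean and flagged lists so that
    chunks containing known injection patterns never reach the
    model context."""
    clean, flagged = [], []
    for doc in docs:
        (flagged if detector(doc) else clean).append(doc)
    return clean, flagged
```

Flagged chunks can be logged for review rather than silently dropped, which also surfaces poisoned documents in the index itself.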
Code Injection
Exploiting LLM-powered tools to execute arbitrary code or commands.
"os.system()", "eval()", "exec()", "run this command", "shell command"
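The code-injection indicators above lend themselves to the same pattern treatment. A sketch under the same caveat (illustrative patterns, hypothetical function name):

```python
import re

# Indicators of attempts to drive tool execution; word boundaries and
# escaped dots keep matches precise (e.g. "eval(" but not "retrieval").
CODE_INJECTION_PATTERNS = [
    r"os\.system\s*\(",
    r"\beval\s*\(",
    r"\bexec\s*\(",
    r"run\s+this\s+command",
    r"shell\s+command",
]

def flags_code_injection(text: str) -> bool:
    """Return True if the input contains a known code-injection cue."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in CODE_INJECTION_PATTERNS)
```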
Payload Splitting
Splitting malicious prompts across multiple inputs that combine at inference.
"combine the following", "concatenate these", "assemble from parts"
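Because the fragments are individually benign, per-message scanning is not enough; the scanner also needs to see recent turns joined together. A sketch of that idea, with `ConversationScanner` and its window size as assumptions:

```python
from collections import deque

class ConversationScanner:
    """Scan the joined window of recent inputs so that a payload
    split across turns is evaluated as one string."""

    def __init__(self, window: int = 5):
        self.turns = deque(maxlen=window)

    def check(self, user_input: str, detector) -> bool:
        self.turns.append(user_input)
        combined = " ".join(self.turns)
        return detector(combined)
```

The window bounds cost: only the last few turns are rescanned, which is usually enough because split payloads must recombine within the model's working context anyway.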
Multimodal Injection
Hidden prompts in images, audio, or video processed by multimodal models.
Requires multimodal input scanning. RavelinStream scans text inputs only.
Adversarial Suffix
Seemingly meaningless character strings appended to bypass safety measures.
Pattern-based detection has limits. LLM catches intent regardless of obfuscation.
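One heuristic that complements both layers: a near-random appended suffix has noticeably higher character entropy than natural language. This is an illustrative heuristic only, not RavelinStream's implementation; the tail length and threshold are assumptions that would need tuning:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_adversarial_suffix(
    text: str, tail_len: int = 40, threshold: float = 4.5
) -> bool:
    """Flag inputs whose trailing characters look near-random.

    English prose typically sits below ~4.2 bits/char over short
    windows; random mixed-alphabet suffixes sit higher.
    """
    tail = text[-tail_len:]
    return len(tail) >= tail_len and shannon_entropy(tail) > threshold
```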
Multilingual / Obfuscated
Base64 or other encodings, emoji substitution, or multiple languages used to evade filters.
Base64 decoding + pattern matching on decoded content. LLM catches intent.
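The decode-then-rescan step can be sketched as follows: find base64-looking substrings, decode the ones that yield printable text, and run the same detector over each decoding. A hedged sketch; the helper names and the 16-character minimum are assumptions:

```python
import base64
import re

# A run of 16+ base64-alphabet characters, optionally padded, is worth
# attempting to decode; shorter runs produce too many false positives.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def decoded_candidates(text: str):
    """Yield printable UTF-8 decodings of base64-looking substrings."""
    for match in B64_RE.finditer(text):
        chunk = match.group(0)
        # Valid base64 length is a multiple of 4; trim any remainder.
        chunk = chunk[: len(chunk) - len(chunk) % 4]
        try:
            decoded = base64.b64decode(chunk).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue
        if decoded.isprintable():
            yield decoded

def scan_with_decoding(text: str, detector) -> bool:
    """Run the detector over the raw text and every decoded candidate."""
    return detector(text) or any(detector(d) for d in decoded_candidates(text))
```

Running the same detector over decoded content means the encoding layer adds no new rules, only a new view of the input; the LLM moderation layer remains the backstop for encodings this sketch does not cover.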