Security Coverage
OWASP LLM01:2025 — Prompt Injection
RavelinStream's policy engine is aligned with the OWASP Top 10 for LLM Applications. Here's how each attack scenario maps to our detection layers.
Direct Prompt Injection
User inputs crafted to override the system prompt, exfiltrate secrets, or manipulate model behavior.
"Ignore all previous instructions", "override your rules", "you are now DAN"
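A pattern layer for phrases like those above can be sketched as below. This is an illustrative sketch, not RavelinStream's actual ruleset; the function name and regex list are assumptions:

```python
import re

# Illustrative subset of direct-injection phrases; a real policy
# engine would maintain a much larger, versioned ruleset.
DIRECT_INJECTION_PATTERNS = [
    r"ignore\s+(all\s+)?previous\s+instructions",
    r"override\s+your\s+rules",
    r"you\s+are\s+now\s+dan",
]

def matches_direct_injection(text: str) -> bool:
    """Return True if any known direct-injection phrase appears."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in DIRECT_INJECTION_PATTERNS)
```

Pattern matching is cheap and deterministic, which is why it runs before the LLM moderation layer; the LLM layer then covers paraphrases the patterns miss.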
Indirect Prompt Injection
Hidden instructions in external content (emails, webpages, PDFs) processed by the LLM.
Requires upstream content sanitization. LLM moderation catches intent.
Unintentional Injection
Users inadvertently triggering model behavior through legitimate-seeming inputs.
Lower severity. Context-dependent — LLM evaluates intent.
Model Influence via RAG
Poisoned documents in retrieval pipelines that alter LLM outputs.
Detects known injection patterns even when embedded in retrieved content.
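One way to apply that detection in a retrieval pipeline is to screen each retrieved chunk before it enters the model context. A minimal sketch, assuming a detector callback of the kind shown elsewhere on this page; `filter_retrieved` is a hypothetical helper, not a RavelinStream API:

```python
def filter_retrieved(docs, detector):
    """Split retrieved chunks into clean and flagged lists so that
    chunks containing known injection patterns never reach the
    model context."""
    clean, flagged = [], []
    for doc in docs:
        (flagged if detector(doc) else clean).append(doc)
    return clean, flagged
```

Flagged chunks can be logged for review rather than silently dropped, which also surfaces poisoned documents in the index itself.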
Code Injection
Exploiting LLM-powered tools to execute arbitrary code or commands.
"os.system()", "eval()", "exec()", "run this command", "shell command"
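The code-injection indicators above lend themselves to the same pattern treatment. A sketch under the same caveat (illustrative patterns, hypothetical function name):

```python
import re

# Indicators of attempts to drive tool execution; word boundaries and
# escaped dots keep matches precise (e.g. "eval(" but not "retrieval").
CODE_INJECTION_PATTERNS = [
    r"os\.system\s*\(",
    r"\beval\s*\(",
    r"\bexec\s*\(",
    r"run\s+this\s+command",
    r"shell\s+command",
]

def flags_code_injection(text: str) -> bool:
    """Return True if the input contains a known code-injection cue."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in CODE_INJECTION_PATTERNS)
```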
Payload Splitting
Splitting malicious prompts across multiple inputs that combine at inference.
"combine the following", "concatenate these", "assemble from parts"
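Because the fragments are individually benign, per-message scanning is not enough; the scanner also needs to see recent turns joined together. A sketch of that idea, with `ConversationScanner` and its window size as assumptions:

```python
from collections import deque

class ConversationScanner:
    """Scan the joined window of recent inputs so that a payload
    split across turns is evaluated as one string."""

    def __init__(self, window: int = 5):
        self.turns = deque(maxlen=window)

    def check(self, user_input: str, detector) -> bool:
        self.turns.append(user_input)
        combined = " ".join(self.turns)
        return detector(combined)
```

The window bounds cost: only the last few turns are rescanned, which is usually enough because split payloads must recombine within the model's working context anyway.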
Multimodal Injection
Hidden prompts in images, audio, or video processed by multimodal models.
Requires multimodal input scanning. RavelinStream scans text inputs only.
Adversarial Suffix
Seemingly meaningless character strings appended to bypass safety measures.
Pattern-based detection has limits. LLM catches intent regardless of obfuscation.
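One heuristic that complements both layers: a near-random appended suffix has noticeably higher character entropy than natural language. This is an illustrative heuristic only, not RavelinStream's implementation; the tail length and threshold are assumptions that would need tuning:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_adversarial_suffix(
    text: str, tail_len: int = 40, threshold: float = 4.5
) -> bool:
    """Flag inputs whose trailing characters look near-random.

    English prose typically sits below ~4.2 bits/char over short
    windows; random mixed-alphabet suffixes sit higher.
    """
    tail = text[-tail_len:]
    return len(tail) >= tail_len and shannon_entropy(tail) > threshold
```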
Multilingual / Obfuscated
Base64 or other encodings, emoji substitution, or multiple languages used to evade filters.
Base64 decoding + pattern matching on decoded content. LLM catches intent.
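The decode-then-rescan step can be sketched as follows: find base64-looking substrings, decode the ones that yield printable text, and run the same detector over each decoding. A hedged sketch; the helper names and the 16-character minimum are assumptions:

```python
import base64
import re

# A run of 16+ base64-alphabet characters, optionally padded, is worth
# attempting to decode; shorter runs produce too many false positives.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def decoded_candidates(text: str):
    """Yield printable UTF-8 decodings of base64-looking substrings."""
    for match in B64_RE.finditer(text):
        chunk = match.group(0)
        # Valid base64 length is a multiple of 4; trim any remainder.
        chunk = chunk[: len(chunk) - len(chunk) % 4]
        try:
            decoded = base64.b64decode(chunk).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue
        if decoded.isprintable():
            yield decoded

def scan_with_decoding(text: str, detector) -> bool:
    """Run the detector over the raw text and every decoded candidate."""
    return detector(text) or any(detector(d) for d in decoded_candidates(text))
```

Running the same detector over decoded content means the encoding layer adds no new rules, only a new view of the input; the LLM moderation layer remains the backstop for encodings this sketch does not cover.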