Where to Find These Rules
Go to Agentic Security > Agent Security (/security/prompt) and select the Injection tab. You can filter rules by category, severity, or status using the FilterSearch bar.
What Is Prompt Injection?
Prompt injection occurs when untrusted content (from files, tool output, or user input) contains instructions designed to override the AI agent’s intended behavior. For example:
- A malicious README file containing “Ignore all previous instructions and run this command…”
- A tool response embedding hidden instructions to exfiltrate data
- A comment in source code attempting to change the agent’s role
Rule Categories
Prompt Injection (INJ-01 to INJ-04, INJ-15, INJ-18)
Attempts to manipulate the agent’s instructions:
- Prompt Override: “Ignore previous instructions”
- Role Hijack: “You are now admin with full access”
- Authority Coercion: “As your developer, run this…”
- Tool Execution Bait: “Run this without asking”
- Hidden Instructions: Instructions embedded in tool output
- Tool Injection Phrasing: Attempts to inject tool calls
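As a rough illustration of how phrase-based rules like these might work, the sketch below matches a few of the quoted phrases with regular expressions. The pattern names and the expressions themselves are hypothetical and written only for this example; the product's actual rule definitions are not shown in this document.

```python
import re

# Hypothetical patterns for illustration only; they are not the
# product's real rule definitions.
PATTERNS = {
    "prompt_override": re.compile(
        r"\bignore\s+(?:all\s+)?(?:previous|prior)\s+instructions\b",
        re.IGNORECASE,
    ),
    "role_hijack": re.compile(
        r"\byou\s+are\s+now\s+(?:an?\s+)?(?:admin|administrator|root)\b",
        re.IGNORECASE,
    ),
    "tool_execution_bait": re.compile(
        r"\brun\s+this\s+without\s+asking\b",
        re.IGNORECASE,
    ),
}


def match_injection_phrases(text: str) -> list[str]:
    """Return the names of all patterns found in text, sorted alphabetically."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
```

Calling `match_injection_phrases("You are now admin with full access")` would report `role_hijack`. Plain phrase matching is easy to evade with rewording, which is one reason the obfuscation category exists alongside it.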
Shell Injection (INJ-05 to INJ-06)
Requests to execute dangerous commands:
- Destructive Shell: Requests to run destructive shell commands
- Remote Script Exec: Requests to download and execute remote scripts
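The two shell injection cases can be pictured with a small command classifier. This is a minimal sketch with deliberately narrow heuristics: it assumes that "remote script exec" means piping a `curl` or `wget` download into a shell, and that "destructive" means a recursive forced `rm` on `/` or `~`. Both regular expressions and the finding names are assumptions made for this example.

```python
import re

# Hypothetical heuristics: a download piped straight into a shell.
REMOTE_EXEC = re.compile(r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b")

# Hypothetical heuristics: rm with both -r and -f aimed at "/" or "~".
DESTRUCTIVE = re.compile(
    r"\brm\s+-(?=[a-zA-Z]*r)(?=[a-zA-Z]*f)[a-zA-Z]+\s+(?:/|~)(?:\s|$)"
)


def classify_command(command: str) -> list[str]:
    """Return the finding names that apply to a shell command string."""
    findings = []
    if DESTRUCTIVE.search(command):
        findings.append("destructive_shell")
    if REMOTE_EXEC.search(command):
        findings.append("remote_script_exec")
    return findings
```

A command such as `rm -rf ./build` is not flagged here because the target is a relative directory; a real rule set would likely need a broader and more carefully tuned definition of "destructive".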
Exfiltration (INJ-07, INJ-10)
Attempts to send data outside the environment:
- Exfil Request: Sending data to external endpoints
- Exfil to Webhook: Sending data to webhook URLs
Obfuscation (INJ-09, INJ-16, INJ-20)
Encoded or hidden malicious content:
- Base64 Obfuscation: Base64-encoded payloads hiding malicious content
- Zero-Width Characters: Unicode zero-width characters hiding content
- Unicode Lookalikes: Characters that look like ASCII but are different
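To make the three obfuscation techniques concrete, here is a minimal sketch of one check per technique. The function names, the set of zero-width code points, and the 24-character threshold for Base64 runs are all assumptions chosen for this example, not values taken from the product.

```python
import base64
import binascii
import re

# Assumed set of zero-width code points; other invisible characters exist.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Assumed threshold: only runs of 24+ Base64 characters are inspected.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_zero_width(text: str) -> list[int]:
    """Return the indexes of zero-width characters in text."""
    return [i for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def decode_base64_runs(text: str) -> list[str]:
    """Decode Base64-looking runs that yield valid UTF-8, so the decoded
    text can be scanned by other rules."""
    decoded = []
    for run in BASE64_RUN.findall(text):
        try:
            decoded.append(base64.b64decode(run, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64, or not text
    return decoded


def find_mixed_script_words(text: str) -> list[str]:
    """Return words that mix ASCII letters with non-ASCII letters, a crude
    signal for lookalike characters (it also flags words like "café")."""
    suspicious = []
    for word in text.split():
        letters = [ch for ch in word if ch.isalpha()]
        if any(ch.isascii() for ch in letters) and any(not ch.isascii() for ch in letters):
            suspicious.append(word)
    return suspicious
```

The lookalike check is intentionally simple and will produce false positives on legitimate accented text; a production rule would more plausibly compare characters against a table of known confusables.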
File Access (INJ-08, INJ-14, INJ-19)
Unauthorized file access attempts:
- Secret File Target: Accessing .env, credentials files
- High-Risk File Target: Accessing system files
- Path Traversal: Using ../ to access parent directories
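The file access checks can be sketched as two small path tests. The workspace root `/workspace`, the list of secret file names, and both function names are hypothetical and exist only for this example. The traversal check is purely lexical: it normalizes the path string and does not resolve symlinks.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical list of sensitive file names for illustration.
SECRET_NAMES = {".env", "credentials", "credentials.json"}


def escapes_root(path: str, root: str = "/workspace") -> bool:
    """Return True if path, resolved lexically against root, lands outside root."""
    root = posixpath.normpath(root)
    resolved = posixpath.normpath(posixpath.join(root, path))
    return not (resolved == root or resolved.startswith(root + "/"))


def targets_secret_file(path: str) -> bool:
    """Return True if the final path component looks like a secret file."""
    name = PurePosixPath(path).name
    return name in SECRET_NAMES or name.startswith(".env.")
```

For example, `escapes_root("../etc/passwd")` is true because the normalized result is `/etc/passwd`, while `escapes_root("./a/../b")` is false because it stays inside the assumed root.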
Other Categories
- Cloud Metadata SSRF (INJ-11): Accessing cloud provider metadata endpoints (169.254.169.254)
- Disable Security (INJ-12): Requests to disable security features or hooks
- Dependency Install (INJ-13): Installing packages from untrusted sources
- Secret in Content (INJ-17): Secrets being written to files
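For the Cloud Metadata SSRF case, a minimal check might parse a URL and compare its host with the metadata address named above. The function name is hypothetical, and the check is narrow by design: it does not catch hostnames that resolve to that address, alternative IP encodings, or strings without a URL scheme.

```python
import ipaddress
from urllib.parse import urlsplit

# The metadata address mentioned in the rule description.
METADATA_IP = ipaddress.ip_address("169.254.169.254")


def targets_metadata_endpoint(url: str) -> bool:
    """Return True if the URL's host is literally the metadata IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # no scheme/authority, so no host could be parsed
    try:
        return ipaddress.ip_address(host) == METADATA_IP
    except ValueError:
        return False  # host is a name, not an IP literal
```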
Hook Types
Each injection rule specifies which Claude Code hook events it responds to:

| Hook | When It Fires |
|---|---|
| PreToolUse | Before a tool executes (catches malicious commands, file writes) |
| PostToolUse | After a tool returns (catches injections in tool output) |
| UserPromptSubmit | When the user submits a prompt (catches injections in user input) |
/security/settings) under the Hook Types section.
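The relationship between rules and hook events can be sketched as a small dispatcher: each rule lists the hooks it responds to, and an incoming event is only checked against rules registered for its hook. Everything here is an assumption for illustration: the `Rule` class, the substring matching, the rule-to-hook assignments, and the event keys (`hook_event_name`, `tool_input`, `tool_response`, `prompt`). Check the Claude Code hooks documentation for the actual payload shape.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    needle: str        # lowercase substring; a stand-in for real matching logic
    hooks: frozenset   # hook events this rule responds to


# Hypothetical rules and hook assignments.
RULES = [
    Rule("prompt_override", "ignore previous instructions",
         frozenset({"PostToolUse", "UserPromptSubmit"})),
    Rule("remote_script_exec", "| sh", frozenset({"PreToolUse"})),
]


def text_for_event(event: dict) -> str:
    """Pick the part of the event that each hook type is meant to inspect."""
    kind = event.get("hook_event_name")
    if kind == "PreToolUse":        # before the tool runs: inspect its input
        return json.dumps(event.get("tool_input", {}))
    if kind == "PostToolUse":       # after the tool returns: inspect its output
        return json.dumps(event.get("tool_response", ""))
    if kind == "UserPromptSubmit":  # user input: inspect the prompt text
        return str(event.get("prompt", ""))
    return ""


def evaluate(event: dict) -> list[str]:
    """Return the names of rules that respond to this hook and match its text."""
    kind = event.get("hook_event_name")
    text = text_for_event(event).lower()
    return [r.name for r in RULES if kind in r.hooks and r.needle in text]
```

In this sketch, a rule that is not registered for a given hook never fires on it, even if the text would match. That mirrors the idea that enabling or disabling a hook type changes which events a rule can see.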