3.1 Prompt Injection Protection

3. Input Security — How do we protect against malicious input?

Description

The agent MUST implement defenses against prompt injection attacks.

Rationale

Prompt injection is the #1 attack vector against AI agents (OWASP LLM Top 10 #1). Successful attacks can bypass access controls, exfiltrate data, or execute unauthorized actions.

Audit Procedure

1. Review user input handling
2. Check for clear delimiters in prompts
3. Test with common injection payloads
4. Run: hackmyagent attack --category prompt-injection

Remediation

1. Use structured prompts with explicit delimiters
2. Implement input sanitization
3. Apply output filtering
4. Consider using a prompt firewall

Framework Mappings

CIS Control 16NIST PR.DS-5OWASP LLM01:2023

Previous2.5 Human-in-the-Loop for Sensitive Actions Next3.2 Instruction Boundary Enforcement