Changelog
Updates
Recent releases, specification changes, and project milestones across the OASB ecosystem.
Benchmark re-measured: 82.9% F1 on 4,245 labeled samples
OASB Eval was re-scored on the v2.0 dataset (4,245 labeled samples across 9 attack categories) using a posture-vs-attack verdict. The HackMyAgent full pipeline (reference adapter) scores 82.9% F1 at a 1.16% false-positive rate (82.6% recall). Measured on hackmyagent 0.23.8.
Eval remapped to 15 MITRE ATLAS techniques; OASB-2 domains renumbered
OASB Eval scenario counts were reconciled and remapped to the current MITRE ATLAS technique set: 222 scenarios now map to 15 ATLAS techniques (was 10). OASB-2 behavioral governance domains were renumbered to 11-19.
SEC-021 control, NanoMind telemetry, and Trusted Publishing
Added the SEC-021 policy-enforcement fail-closed control and NanoMind classifier telemetry to the scanner benchmark. OASB now publishes to npm via GitHub Actions Trusted Publishing (OIDC, with SLSA provenance).
Product-agnostic adapter interface and first third-party benchmark
OASB is now fully independent of any security product. All 222 scenarios use a SecurityProductAdapter interface - implement it for your product and run the same scorecard. Includes a capability comparison: arp-guard (reference) covers every surface; llm-guard is a prompt-only scanner, so its other surfaces report N/A rather than failure. ARP renamed to arp-guard on npm.
AI-layer test scenarios added to OASB Eval
Added 40 new atomic tests covering AI-layer detection: prompt input/output scanning, MCP tool call validation, A2A message scanning, and full pattern coverage validation. Total Eval scenarios now at 222.
ARP gains AI-layer interceptors and HTTP proxy
New PromptInterceptor, MCPProtocolInterceptor, and A2AProtocolInterceptor detect prompt injection, jailbreak, data exfiltration, MCP exploitation, and A2A identity spoofing. New HTTP reverse proxy mode for inline protection of existing agents.
MCP and A2A attack modes added to HackMyAgent
HackMyAgent now supports 7 attack categories with 75 payloads: prompt injection, jailbreak, data exfiltration, context manipulation, resource exhaustion, MCP exploitation, and A2A attacks. New --target-type flag for MCP JSON-RPC and A2A protocol targets.
DVAA adds MCP JSON-RPC and A2A endpoints
Damn Vulnerable AI Agent now exposes MCP-over-HTTP (JSON-RPC 2.0) and A2A message endpoints for security testing. 7 vulnerable agent bots with configurable vulnerability levels.
Securing OpenClaw: 6 security fixes merged upstream
Contributed 6 security patches to the OpenClaw project addressing credential exposure, input validation, and dependency vulnerabilities. All patches accepted and merged.
OASB-1 specification published
Released the OASB-1 specification defining 46 security controls across 10 categories with L1/L2/L3 maturity levels for AI agent security compliance.