Changelog

Updates

Recent releases, specification changes, and project milestones across the OASB ecosystem.

Jun 5, 2026OASB Eval v2.0 dataset

Benchmark re-measured: 82.9% F1 on 4,245 labeled samples

OASB Eval was re-scored on the v2.0 dataset (4,245 labeled samples across 9 attack categories) using a posture-vs-attack verdict. The HackMyAgent full pipeline (reference adapter) scores 82.9% F1 at a 1.16% false-positive rate (82.6% recall). Measured on hackmyagent 0.23.8.

evalbenchmark

Reference results

Jun 4, 2026

Eval remapped to 15 MITRE ATLAS techniques; OASB-2 domains renumbered

OASB Eval scenario counts were reconciled and remapped to the current MITRE ATLAS technique set: 222 scenarios now map to 15 ATLAS techniques (was 10). OASB-2 behavioral governance domains were renumbered to 11-19.

evaloasb-2specification

OASB Eval

Apr 22, 2026OASB v0.3.2

SEC-021 control, NanoMind telemetry, and Trusted Publishing

Added the SEC-021 policy-enforcement fail-closed control and NanoMind classifier telemetry to the scanner benchmark. OASB now publishes to npm via GitHub Actions Trusted Publishing (OIDC, with SLSA provenance).

oasb-1release

npm: @opena2a/oasb

Mar 23, 2026OASB v0.3.0

Product-agnostic adapter interface and first third-party benchmark

OASB is now fully independent of any security product. All 222 scenarios use a SecurityProductAdapter interface - implement it for your product and run the same scorecard. Includes a capability comparison: arp-guard (reference) covers every surface; llm-guard is a prompt-only scanner, so its other surfaces report N/A rather than failure. ARP renamed to arp-guard on npm.

evaladapterbenchmark

npm: @opena2a/oasb npm: arp-guard Benchmark Report

Feb 19, 2026Eval v0.2.0

AI-layer test scenarios added to OASB Eval

Added 40 new atomic tests covering AI-layer detection: prompt input/output scanning, MCP tool call validation, A2A message scanning, and full pattern coverage validation. Total Eval scenarios now at 222.

evalai-layerarp

AT-AI-001 through AT-AI-005

Feb 19, 2026ARP v0.2.0

ARP gains AI-layer interceptors and HTTP proxy

New PromptInterceptor, MCPProtocolInterceptor, and A2AProtocolInterceptor detect prompt injection, jailbreak, data exfiltration, MCP exploitation, and A2A identity spoofing. New HTTP reverse proxy mode for inline protection of existing agents.

arprelease

npm: arp-guard

Feb 19, 2026HackMyAgent v0.7.0

MCP and A2A attack modes added to HackMyAgent

HackMyAgent now supports 7 attack categories with 75 payloads: prompt injection, jailbreak, data exfiltration, context manipulation, resource exhaustion, MCP exploitation, and A2A attacks. New --target-type flag for MCP JSON-RPC and A2A protocol targets.

hackmyagentrelease

HackMyAgent

Feb 18, 2026DVAA v0.4.0

DVAA adds MCP JSON-RPC and A2A endpoints

Damn Vulnerable AI Agent now exposes MCP-over-HTTP (JSON-RPC 2.0) and A2A message endpoints for security testing. 7 vulnerable agent bots with configurable vulnerability levels.

dvaarelease

DVAA on GitHub

Feb 17, 2026

Securing OpenClaw: 6 security fixes merged upstream

Contributed 6 security patches to the OpenClaw project addressing credential exposure, input validation, and dependency vulnerabilities. All patches accepted and merged.

contributionsopenclaw

Read on opena2a.org

Feb 9, 2026

OASB-1 specification published

Released the OASB-1 specification defining 46 security controls across 10 categories with L1/L2/L3 maturity levels for AI agent security compliance.

oasb-1specification

Read the specification