AI Bot Capability Benchmarks: A Standardized Test Suite

A proposed standard for measuring what AI bots can actually do when they visit your website.

The Need for Standardized Bot Testing

There's no standard way to measure what an AI bot can do when it visits your website. Can it follow links? Parse JavaScript? Fill forms? Read structured data? We built a benchmark suite to answer these questions systematically.

The Test Suite

Our benchmark consists of five progressively harder tests:

1. Navigation: follow an internal link to a specific page.
2. Comprehension: extract a specific piece of information from a page.
3. Form Interaction: fill out a contact form with specific data.
4. Crypto Parsing: read and validate a blockchain wallet address.
5. Multi-step: complete a 3-page workflow requiring state management.
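To make "progressively harder" concrete, here is a minimal sketch of the five tests described above as data, plus a helper that reports how far up the ladder a bot got. The class name, field names, and the helper `highest_consecutive_level` are illustrative assumptions, not part of the actual suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkTest:
    level: int  # 1 = easiest, 5 = hardest
    name: str
    task: str


# The five tests described in the text; this representation is hypothetical.
SUITE = (
    BenchmarkTest(1, "Navigation", "follow an internal link to a specific page"),
    BenchmarkTest(2, "Comprehension", "extract a specific piece of information from a page"),
    BenchmarkTest(3, "Form Interaction", "fill out a contact form with specific data"),
    BenchmarkTest(4, "Crypto Parsing", "read and validate a blockchain wallet address"),
    BenchmarkTest(5, "Multi-step", "complete a 3-page workflow requiring state management"),
)


def highest_consecutive_level(passed_levels: set[int]) -> int:
    """Return the highest N such that levels 1..N were all passed (0 if none)."""
    reached = 0
    for test in SUITE:
        if test.level not in passed_levels:
            break
        reached = test.level
    return reached
```

Under this sketch, a bot that passed only Tests 1 and 2 is at level 2, while a bot that passed Tests 1 and 3 but failed Test 2 is still at level 1, since the ladder is meant to be climbed in order.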

Test Infrastructure

Each test is a standalone page with a clear success/failure signal. We track which bots attempt each test and whether they succeed. Tests are designed to be deterministic (the same input always yields the same expected output), bot-friendly (no CAPTCHAs or anti-bot measures), and measurable (a clear pass/fail result logged server-side).
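One way the deterministic, server-side pass/fail check could look is sketched below. This is an assumption about the shape of the grading step, not our production code: the `Attempt` record, the `normalize` rule, and the use of the User-Agent string as the bot identity are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Attempt:
    bot: str      # assumed identity, e.g. the User-Agent string
    test_id: int  # 1..5
    passed: bool


def normalize(value: str) -> str:
    """Collapse whitespace and case so equivalent answers compare equal."""
    return " ".join(value.split()).lower()


def grade(bot: str, test_id: int, submitted: str, expected: str) -> Attempt:
    """Deterministic check: the same submitted/expected pair always gives the same result."""
    return Attempt(bot, test_id, normalize(submitted) == normalize(expected))


def to_log_line(attempt: Attempt) -> str:
    """Serialize one attempt as a JSON line for a server-side log."""
    return json.dumps(asdict(attempt), sort_keys=True)
```

The lenient normalization here suits free-text answers such as the comprehension test. A case-sensitive answer, such as a wallet address, would need an exact comparison instead.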

Results So Far

We tracked 9 unique bots over 2 weeks. All of them passed Test 1 (navigation, i.e. link following). About half passed Test 2 (comprehension). None passed Tests 3-5. This suggests that current AI crawlers are readers, not actors: they can find and index content but cannot interact with it.
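Figures like these can be derived from the attempt log by counting, per test, the share of unique bots that passed at least once. The sketch below assumes a log of `(bot, test_id, passed)` tuples. That record shape is hypothetical, and the sample data in the usage note is made up for illustration, not our measured results.

```python
from collections import defaultdict


def pass_rates(attempts):
    """Map test_id -> fraction of unique bots that passed it at least once.

    `attempts` is an iterable of (bot, test_id, passed) tuples.
    Bots that never attempted a test count as not having passed it.
    """
    bots = set()
    passed_by_test = defaultdict(set)
    for bot, test_id, passed in attempts:
        bots.add(bot)
        if passed:
            passed_by_test[test_id].add(bot)
    if not bots:
        return {}
    # Report all five tests, including those nobody passed.
    return {t: len(passed_by_test[t]) / len(bots) for t in range(1, 6)}
```

For example, with two invented bots where both pass Test 1 and only one passes Test 2, `pass_rates` reports 1.0 for Test 1, 0.5 for Test 2, and 0.0 for Tests 3-5.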

Why This Matters

As AI agents become more capable, websites need to understand what bots can do. E-commerce sites need to know if bots can complete purchases. SaaS products need to know if bots can sign up. News sites need to know if bots can bypass paywalls. A standardized capability benchmark helps everyone prepare.

Deploying the Benchmark

We're working to make this test suite deployable on any website. The goal is a lightweight JavaScript snippet that serves the tests and reports results to a central database, building the first comprehensive map of AI bot capabilities across the web. Contact us or check our GitHub if you want to participate.
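The deployable snippet itself would be JavaScript, and its format is not final. Purely to illustrate what a report to the central database might carry, here is a Python sketch that builds (but does not send) such a request. The endpoint URL, the payload fields, and the function name `build_report` are all assumptions.

```python
import json
import urllib.request

# Hypothetical endpoint; no real reporting URL has been published.
REPORT_URL = "https://benchmark.example/report"


def build_report(site: str, bot: str, results: dict[int, bool]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one bot's results."""
    payload = {
        "site": site,
        "bot": bot,
        # JSON object keys must be strings, so test ids are stringified.
        "results": {str(test_id): passed for test_id, passed in sorted(results.items())},
    }
    return urllib.request.Request(
        REPORT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Keeping the report to a site, a bot identity, and a per-test pass/fail map would keep the payload small enough for a lightweight snippet.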
