AI Bot Capability Benchmarks: A Standardized Test Suite
A proposed standard for measuring what AI bots can actually do when they visit your website.
The Need for Standardized Bot Testing
There's no standard way to measure what an AI bot can do when it visits your website. Can it follow links? Parse JavaScript? Fill forms? Read structured data? We built a benchmark suite to answer these questions systematically.
The Test Suite
Our benchmark consists of five progressively harder tests:
1. Navigation: follow an internal link to a specific page.
2. Comprehension: extract a specific piece of information from a page.
3. Form Interaction: fill out a contact form with specific data.
4. Crypto Parsing: read and validate a blockchain wallet address.
5. Multi-step: complete a 3-page workflow requiring state management.
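As a rough illustration, the five tests can be written down as data, which makes the "progressively harder" ordering explicit. The sketch below is not the suite's actual code: the names `BenchmarkTest`, `SUITE`, and `capability_level` are hypothetical, and the idea of summarizing a bot by the highest unbroken run of passed tests is our assumption about how the ordering might be used.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkTest:
    """One test in the suite (hypothetical representation)."""
    level: int
    name: str
    task: str


# The five tests, in order of increasing difficulty, as described in the text.
SUITE = (
    BenchmarkTest(1, "Navigation", "follow an internal link to a specific page"),
    BenchmarkTest(2, "Comprehension", "extract a specific piece of information from a page"),
    BenchmarkTest(3, "Form Interaction", "fill out a contact form with specific data"),
    BenchmarkTest(4, "Crypto Parsing", "read and validate a blockchain wallet address"),
    BenchmarkTest(5, "Multi-step", "complete a 3-page workflow requiring state management"),
)


def capability_level(passed_levels: set[int]) -> int:
    """Return the highest N such that tests 1..N were all passed (0 if none).

    Assumption: a later pass does not count if an earlier test was failed.
    """
    level = 0
    for test in SUITE:
        if test.level not in passed_levels:
            break
        level = test.level
    return level
```

Under this assumption, a bot that passes Tests 1 and 2 would be summarized as level 2, while a bot that passes Tests 1 and 3 but fails Test 2 would stay at level 1.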
Test Infrastructure
Each test is a standalone page with a clear success/failure signal. We track which bots attempt each test and whether they succeed. Tests are designed to be: deterministic (same input = same expected output), bot-friendly (no CAPTCHAs or anti-bot measures), and measurable (clear pass/fail logged server-side).
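To make the "deterministic" and "measurable" properties concrete, here is a minimal sketch of server-side pass/fail logging. It assumes each test has a single fixed expected answer and that a bot's submission arrives as a string; the `ResultLog` and `Attempt` names, the exact-match rule, and the bot label are all hypothetical, not a description of our production logger.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Attempt:
    """One logged attempt by one bot on one test."""
    bot: str
    test: int
    passed: bool
    at: str  # ISO 8601 timestamp in UTC


@dataclass
class ResultLog:
    """In-memory stand-in for a server-side log (hypothetical)."""
    expected: dict[int, str]  # test number -> the one correct answer
    attempts: list[Attempt] = field(default_factory=list)

    def record(self, bot: str, test: int, submitted: str) -> bool:
        # Deterministic check: the same submission always yields the same verdict.
        passed = submitted.strip() == self.expected[test]
        timestamp = datetime.now(timezone.utc).isoformat()
        self.attempts.append(Attempt(bot, test, passed, timestamp))
        return passed


# Usage with made-up values: Test 2 expects the string "42".
log = ResultLog(expected={2: "42"})
log.record("ExampleBot/1.0", 2, " 42 ")  # passes: whitespace is stripped
log.record("ExampleBot/1.0", 2, "41")    # fails: wrong answer
```

Every attempt is stored whether it passes or fails, so the log can answer both questions from the text: which bots attempted a test, and which succeeded.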
Results So Far
After tracking 9 unique bots over 2 weeks, we found that all of them pass Test 1 (navigation/link following), about half pass Test 2 (comprehension), and none has passed Tests 3-5. In this small sample, the results suggest that current AI crawlers are readers, not actors: they can find and index content but cannot interact with it.
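Per-test pass rates like these can be derived from the attempt log with a small aggregation. The sketch below assumes each log entry reduces to a `(bot, test, passed)` tuple and counts a bot as passing a test if any of its attempts succeeded; the function name `pass_rates` and the sample records are invented for illustration and are not our measured data.

```python
from collections import defaultdict


def pass_rates(attempts):
    """Map each test number to the fraction of attempting bots that passed it."""
    attempted = defaultdict(set)  # test -> bots that tried it
    passed = defaultdict(set)     # test -> bots that passed at least once
    for bot, test, ok in attempts:
        attempted[test].add(bot)
        if ok:
            passed[test].add(bot)
    return {t: len(passed[t]) / len(attempted[t]) for t in sorted(attempted)}


# Made-up records, not real results.
sample = [
    ("bot-a", 1, True),
    ("bot-b", 1, True),
    ("bot-a", 2, True),
    ("bot-b", 2, False),
    ("bot-a", 3, False),
]
print(pass_rates(sample))  # {1: 1.0, 2: 0.5, 3: 0.0}
```

Note that the denominator here is the number of bots that attempted a test, not the number of bots seen overall, so a test that few bots reach can still show a high pass rate.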
Why This Matters
As AI agents become more capable, websites need to understand what bots can do. E-commerce sites need to know if bots can complete purchases. SaaS products need to know if bots can sign up. News sites need to know if bots can bypass paywalls. A standardized capability benchmark helps everyone prepare.
Deploying the Benchmark
We're working to make this test suite deployable on any website. The goal is a lightweight JavaScript snippet that serves the tests and reports results to a central database, building the first comprehensive map of AI bot capabilities across the web. Contact us or check our GitHub if you want to participate.