Four boards map to Models, Agents, LLMs, and Toolchains; each board's columns can be extended independently (for example vendor, domain, size, or coverage).
Composite scores use configurable weights and normalization; multi-benchmark setups must declare benchmark versions, weights, and missing-value handling.
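
As a rough illustration of the composite-score rule above, the following Python sketch combines per-benchmark scores using declared versions, weights, and a missing-value policy. Everything named here is an assumption made for the example rather than part of the boards themselves: `BenchmarkSpec`, `composite_score`, the min-max normalization bounds `lo`/`hi`, and the policy names `renormalize`, `zero`, and `fail`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BenchmarkSpec:
    """Hypothetical per-benchmark declaration: version, weight, and score range."""
    version: str   # declared benchmark version, e.g. "v1"
    weight: float  # relative weight in the composite
    lo: float      # raw score that maps to 0.0 (assumed min-max normalization)
    hi: float      # raw score that maps to 1.0


def composite_score(
    scores: Dict[str, Optional[float]],
    specs: Dict[str, BenchmarkSpec],
    missing: str = "renormalize",
) -> Optional[float]:
    """Weighted mean of min-max normalized scores.

    Missing-value policies (illustrative names):
      "renormalize": drop the benchmark and rescale the remaining weights
      "zero":        count the missing benchmark as a normalized score of 0
      "fail":        raise ValueError
    """
    if missing not in ("renormalize", "zero", "fail"):
        raise ValueError(f"unknown missing-value policy: {missing}")

    total = 0.0
    weight_sum = 0.0
    for name, spec in specs.items():
        raw = scores.get(name)
        if raw is None:
            if missing == "fail":
                raise ValueError(f"missing score for {name} ({spec.version})")
            if missing == "zero":
                weight_sum += spec.weight  # contributes 0 to the total
            continue
        normalized = (raw - spec.lo) / (spec.hi - spec.lo)
        total += spec.weight * normalized
        weight_sum += spec.weight

    # No usable benchmarks: report no composite rather than a misleading 0.
    if weight_sum == 0:
        return None
    return total / weight_sum


# Usage with made-up benchmark names and values.
specs = {
    "bench_a": BenchmarkSpec(version="v1", weight=2.0, lo=0.0, hi=100.0),
    "bench_b": BenchmarkSpec(version="v2", weight=1.0, lo=0.0, hi=10.0),
}
partial = {"bench_a": 50.0, "bench_b": None}
print(composite_score(partial, specs, missing="renormalize"))
print(composite_score(partial, specs, missing="zero"))
```

The two policies give different results for the same entry with a missing score, which is why the handling needs to be declared alongside the versions and weights.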