FAQ
Common questions about browsing the leaderboards and trends. Where a deployment differs from what is described here, the repository documentation takes precedence.
Are leaderboard scores production evaluations?
Published rankings are generated from pipeline outputs and documented methodology, not vendor marketing claims.
For a release to be credible, keep the snapshots in src/data/db/site.sqlite synchronized with the tasks, weights, dates, and reproducibility notes published in Methodology.
Why static sites?
Static HTML favors SEO, time-to-first-byte, and global CDN caching. Leaderboards may refresh on a daily or weekly cadence via CI-triggered builds.
Where live queries are needed, read-only edge APIs can serve them alongside the static snapshots, with source citations kept for auditability.
Does language switching change the current page?
The path outside the locale prefix is preserved (e.g., /zh/model/ ↔ /en/model/) for side-by-side reading.
Localization is incremental: untranslated long-form sections may temporarily fall back to English or another default language.
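The prefix swap described above can be sketched as follows. This is a minimal illustration, not the site's actual routing code: the function name `switch_locale` and the locale list are assumptions, since only /zh/ and /en/ appear in this FAQ.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed locale prefixes; this FAQ only shows /zh/ and /en/.
SUPPORTED_LOCALES = ("en", "zh")


def switch_locale(url: str, target: str) -> str:
    """Return `url` with its leading locale segment replaced by `target`.

    Everything after the locale prefix (path, query, fragment) is kept.
    """
    if target not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {target}")
    parts = urlsplit(url)
    segments = parts.path.split("/")  # "/zh/model/" -> ["", "zh", "model", ""]
    if len(segments) > 1 and segments[1] in SUPPORTED_LOCALES:
        segments[1] = target          # swap the existing prefix
    else:
        segments.insert(1, target)    # no prefix yet: add one
    return urlunsplit(parts._replace(path="/".join(segments)))


print(switch_locale("/zh/model/", "en"))  # /en/model/
```

Because only the first path segment is touched, a query string or fragment on the current page survives the switch.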
How should composite scores be read?
Composites aggregate per-task metrics after normalization and weighting. They are useful for an overview but are not sufficient on their own for finding weak tasks. Production deployments should also publish per-task scores or sub-ranks where applicable.
When mixing public benchmarks, declare the benchmark versions and how missing cells are handled.
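To make the normalization, weighting, and missing-cell choices concrete, here is a small sketch. It assumes min-max normalization per task and renormalizes weights over the tasks a model actually has. Both are illustrative choices, not necessarily what a given board's Methodology specifies. The task names, model names, and scores are made up.

```python
from typing import Optional

Scores = dict[str, Optional[float]]  # model -> raw score, None = missing cell


def min_max_normalize(by_model: Scores) -> Scores:
    """Scale one task's scores across models to [0, 1]; missing cells stay None."""
    present = [v for v in by_model.values() if v is not None]
    if not present:
        return {m: None for m in by_model}
    lo, hi = min(present), max(present)
    span = hi - lo
    return {
        m: None if v is None else (0.5 if span == 0 else (v - lo) / span)
        for m, v in by_model.items()
    }


def composite(raw: dict[str, Scores], weights: dict[str, float]) -> Scores:
    """Weighted mean of normalized task scores, skipping missing cells.

    Assumed policy: weights are renormalized over the tasks a model has.
    """
    normalized = {task: min_max_normalize(s) for task, s in raw.items()}
    models = sorted({m for s in raw.values() for m in s})
    result: Scores = {}
    for model in models:
        total = weight_sum = 0.0
        for task, weight in weights.items():
            value = normalized.get(task, {}).get(model)
            if value is None:
                continue  # missing cell: contributes no weight
            total += weight * value
            weight_sum += weight
        result[model] = total / weight_sum if weight_sum > 0 else None
    return result


# Hypothetical data: two tasks on different scales, one missing cell.
RAW = {
    "task_a": {"model_x": 60, "model_y": 80, "model_z": 100},
    "task_b": {"model_x": 0.9, "model_y": 0.5, "model_z": None},
}
WEIGHTS = {"task_a": 0.5, "task_b": 0.5}

scores = composite(RAW, WEIGHTS)
print(scores)  # {'model_x': 0.5, 'model_y': 0.25, 'model_z': 1.0}
```

Note that model_z tops the composite while having no task_b result at all. This is why the missing-cell policy needs to be declared and why per-task scores should be published next to the composite.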
How do GitHub trends relate to model leaderboards?
Model and agent boards emphasize capability or task success, while GitHub trends emphasize open-source activity. The two are complementary.
High star counts do not imply state-of-the-art capability; closed-source or off-GitHub work is excluded from trend statistics.
How do I connect an internal eval pipeline?
Typical flow: run the evaluators in CI, write boards and datasets into src/data/db/site.sqlite via the data pipeline, trigger the Astro build, and deploy the static output.
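The write step of that flow might look like the sketch below. The real schema of site.sqlite is not described in this FAQ, so the `board_scores` table, its columns, and the `write_board` helper are all hypothetical. The demo uses an in-memory database so it can be run without touching a real snapshot.

```python
import sqlite3


def write_board(conn: sqlite3.Connection, board: str, data_date: str,
                rows: list[tuple[str, str, float]]) -> int:
    """Upsert (model, task, score) rows for one board and data date.

    Returns the number of rows stored for that board and date.
    The table layout is a hypothetical stand-in for the real schema.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            """CREATE TABLE IF NOT EXISTS board_scores (
                   board TEXT NOT NULL,
                   model TEXT NOT NULL,
                   task TEXT NOT NULL,
                   score REAL,
                   data_date TEXT NOT NULL,
                   PRIMARY KEY (board, model, task, data_date)
               )"""
        )
        conn.executemany(
            "INSERT OR REPLACE INTO board_scores "
            "(board, model, task, score, data_date) VALUES (?, ?, ?, ?, ?)",
            [(board, model, task, score, data_date) for model, task, score in rows],
        )
    cursor = conn.execute(
        "SELECT COUNT(*) FROM board_scores WHERE board = ? AND data_date = ?",
        (board, data_date),
    )
    return cursor.fetchone()[0]


# In CI this would be sqlite3.connect("src/data/db/site.sqlite").
conn = sqlite3.connect(":memory:")
stored = write_board(
    conn,
    board="demo-board",      # hypothetical board name
    data_date="2024-01-01",  # hypothetical data date
    rows=[("model_x", "task_a", 0.5), ("model_y", "task_a", 0.25)],
)
print(stored)  # 2
```

Keying rows on board, model, task, and data date makes a re-run of the same CI job overwrite its earlier rows instead of duplicating them. The build and deploy steps then run against the updated file.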
Snapshots kept in object storage must be listed on the Sources page with their URLs and checksums.
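One way to produce such an entry is sketched below. SHA-256 is an assumed choice of checksum, since this FAQ does not name an algorithm. The `source_entry` helper and its field names are likewise hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large snapshots need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_entry(url: str, path: str) -> dict:
    """Build a record for the Sources page (field names are assumptions)."""
    return {
        "url": url,
        "sha256": sha256_of_file(path),
        "bytes": Path(path).stat().st_size,
    }
```

A reader who downloads the snapshot can recompute the digest and compare it with the published value to confirm that the file matches the one the leaderboard was built from.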
Are outbound links safe?
External links open in a new tab with noopener and noreferrer set. Visitors are responsible for judging the trustworthiness and privacy policies of destination sites.
May these tables be embedded or republished?
The upstream data and code licenses apply. When republishing, include links to Methodology and Sources and state the data date.