Filter candidates by latency and cost first, then run small A/B tests on your critical tasks. Do not make production decisions based on rank alone.
Covers text LLMs from OpenRouter listings. The composite score is a proxy derived from context length and price, not a benchmark result.
Source: OpenRouter
Metrics: context_length_desc + avg_price_asc + owner_diversity
Update cadence: daily
Fetched at: 2026-05-23T13:51:52.987Z
Data version: v20260523T135152Z
Data size: 20
Method: webx-ranking-v2
OpenRouter catalog snapshot: the shortlist is built by ordering models by largest context window first, then by lowest average USD price per 1M tokens (prompt + completion), taking one model per vendor before filling the remaining slots. The listing score blends normalized context length and price within that shortlist. The table is sorted by Score (high to low), so Rank matches the Score column; ties follow the catalog shortlist order.
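The method note can be read as a two-step procedure, sketched below in Python. This is an illustration under stated assumptions, not the actual webx-ranking-v2 code: the `Model` fields, the function names, the use of the mean of prompt and completion price, min-max normalization, and the equal 0.5/0.5 weighting are all hypothetical choices, since the source does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    # Hypothetical record shape; field names are assumptions for illustration.
    model_id: str
    owner: str
    context_length: int
    prompt_price: float      # USD per 1M prompt tokens
    completion_price: float  # USD per 1M completion tokens

    @property
    def avg_price(self) -> float:
        # Assumption: "average" means the mean of prompt and completion price.
        return (self.prompt_price + self.completion_price) / 2


def build_shortlist(catalog, size=20):
    """Largest context first, then lowest average price; one model per owner first."""
    ordered = sorted(catalog, key=lambda m: (-m.context_length, m.avg_price))
    seen_owners = set()
    one_per_owner, remainder = [], []
    for m in ordered:
        if m.owner in seen_owners:
            remainder.append(m)
        else:
            seen_owners.add(m.owner)
            one_per_owner.append(m)
    # Remaining slots are filled from the leftover models, in sorted order.
    return (one_per_owner + remainder)[:size]


def _min_max(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # no spread: every entry gets the same value
    return [(v - lo) / (hi - lo) for v in values]


def score_shortlist(shortlist, context_weight=0.5):
    """Blend normalized context (higher is better) and price (lower is better)."""
    ctx = _min_max([m.context_length for m in shortlist])
    price = _min_max([m.avg_price for m in shortlist])
    scores = [
        context_weight * c + (1 - context_weight) * (1 - p)
        for c, p in zip(ctx, price)
    ]
    # sorted() is stable, so equal scores keep their shortlist order.
    return sorted(zip(shortlist, scores), key=lambda pair: -pair[1])
```

Because the score is normalized within the shortlist, it is relative to that snapshot only: adding or removing one model can shift every other model's score, which is one reason the rank is not a substitute for task-level testing.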