Browse models by modality

Text (LLM), image, video, and multimodal, alongside the combined Model board.

Model leaderboard (all)

Cross-task performance (multimodal / vision / language): sample data; replace with production eval output.

Cross-task overview; once the eval JSON is wired up, the multimodal, vision, and language groups may split into additional columns or child boards.

Public ranking policy: rows are sorted by composite score (descending). Composite score is a weighted sum of normalized sub-metrics; ties are broken by higher recent activity.

| Rank | Model | Vendor / team | Type | Score | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | Demo-Vision-Pro | Demo Lab | Multimodal | 94.2 | Balanced image + text |
| 2 | NorthStar-MM | North AI | Multimodal | 92.8 | Strong long-context scenarios |
| 3 | Aurora-VL-7B | Aurora | Vision-language | 91.5 | Edge-friendly |
| 4 | Helix-3 | Helix Research | General | 90.1 | Stable tool calling |
| 5 | Kite-Small | Kite | Language | 88.6 | Strong price/performance |
| 6 | Lattice-R1 | Lattice | Reasoning | 87.9 | Strong math/code subscores |
| 7 | Pulse-Audio-2 | Pulse | Speech multimodal | 86.4 | ASR/TTS combined |
| 8 | Quark-Mini | Quark Systems | Language | 85.2 | Low latency |
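The ranking policy above (weighted sum of min-max normalized sub-metrics, sorted descending, ties broken by recent activity) can be sketched as follows. The sub-metric names, weights, and the `recent_activity` field are assumptions for illustration; the real eval JSON schema is not specified here.

```python
from dataclasses import dataclass

# Hypothetical sub-metrics and weights; swap in the production schema.
WEIGHTS = {"quality": 0.5, "speed": 0.3, "cost": 0.2}

@dataclass
class Row:
    model: str
    metrics: dict          # raw sub-metric values, higher is better
    recent_activity: int   # e.g. evals run in the last 30 days

def normalize(rows, metric):
    """Min-max normalize one sub-metric across all rows to [0, 1]."""
    vals = [r.metrics[metric] for r in rows]
    lo, hi = min(vals), max(vals)
    span = (hi - lo) or 1.0  # avoid division by zero when all values match
    return {r.model: (r.metrics[metric] - lo) / span for r in rows}

def composite(rows):
    """Weighted sum of the normalized sub-metrics for each model."""
    norm = {m: normalize(rows, m) for m in WEIGHTS}
    return {r.model: sum(w * norm[m][r.model] for m, w in WEIGHTS.items())
            for r in rows}

def rank(rows):
    """Sort by composite score (descending); break ties by recent activity."""
    scores = composite(rows)
    return sorted(rows,
                  key=lambda r: (scores[r.model], r.recent_activity),
                  reverse=True)
```

With this sketch, two models whose sub-metrics are identical end up with equal composite scores, and the one with more recent activity is listed first, matching the stated tie-break rule.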