LLM Rankings

Overall capability of large language models (sample data).

Composite scores may decompose into reasoning, coding, multilingual, and safety dimensions. The methodology must cite the benchmark versions used and the tool-use policy.

Updated: 2026-04-03

Public ranking policy: rows are sorted by composite score in descending order. The composite score is a weighted sum of normalized sub-metrics; ties are broken in favor of the model with higher recent activity.
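
The ranking policy above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's actual implementation: the weights, the `recent_activity` field, the 0-100 scaling, and the rounding to one decimal place are all assumptions, since the page does not publish them.

```python
from dataclasses import dataclass

# Hypothetical weights (assumption): the page does not state the real values.
WEIGHTS = {"reasoning": 0.4, "coding": 0.3, "multilingual": 0.2, "safety": 0.1}


@dataclass
class Entry:
    model: str
    sub_metrics: dict[str, float]  # each sub-metric assumed normalized to [0, 1]
    recent_activity: int           # hypothetical tie-break signal; higher ranks first


def composite_score(sub_metrics: dict[str, float],
                    weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of normalized sub-metrics, scaled to 0-100 (assumed scale)."""
    return 100 * sum(weights[name] * sub_metrics[name] for name in weights)


def rank(entries: list[Entry]) -> list[Entry]:
    """Sort by composite score (descending), then by recent activity (descending)."""
    return sorted(
        entries,
        # Round to one decimal so scores that display identically count as ties.
        key=lambda e: (-round(composite_score(e.sub_metrics), 1), -e.recent_activity),
    )


if __name__ == "__main__":
    same = {"reasoning": 0.9, "coding": 0.9, "multilingual": 0.9, "safety": 0.9}
    better = {name: 0.95 for name in WEIGHTS}
    entries = [Entry("A", same, 5), Entry("B", same, 9), Entry("C", better, 1)]
    for position, entry in enumerate(rank(entries), start=1):
        print(position, entry.model, round(composite_score(entry.sub_metrics), 1))
```

Negating both key components keeps the sort a single pass and makes the tie-break explicit: two models with the same displayed score are ordered by the activity signal alone.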

Rank	Model	Vendor	Size	Score	Notes
1	Nova-Large-2	Nova AI	~400B MoE	95	Reasoning mode
2	Summit-Pro	Summit	~200B	93.4	Strong instruction following
3	DeepLine-R1	DeepLine	~70B	91.9	Open weights
4	Cedar-32B	Cedar	32B	89.7	Balanced Chinese/English
5	Birch-Mini	Birch	8B	87.3	On-device deployment
6	Fjord-1.5	Fjord Labs	14B	86.1	Tool calling
7	Ridge-Code	Ridge	33B	85	Code-specialized
8	Willow-Base	Willow	3B	82.4	Ultra-low latency