Last updated: 1 minute ago

Infrastructure

Latency and price differences for the same model across providers. Sample snapshot only.

Prices on this page are listed in USD per 1M tokens unless a column says otherwise.
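As a quick illustration of how per-1M-token prices translate into a bill, the sketch below estimates the cost of a single workload. The function name `request_cost_usd` and the token counts are hypothetical and chosen only for this example; the prices are taken from the sample snapshot on this page.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate cost in USD from token counts and USD-per-1M-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens
# at $0.15 input / $0.60 output per 1M tokens.
cost = request_cost_usd(200_000, 50_000, 0.15, 0.60)
print(f"${cost:.4f}")  # prints "$0.0600"
```

Because output tokens are usually priced higher than input tokens, the input/output mix of a workload can matter as much as the headline price.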

Same-model comparison

Fastest: Groq, 200ms P50 latency

Best price: OpenRouter, $0.15 input / $0.60 output per 1M tokens

Providers are sorted by a balanced score: lower latency and lower average of input/output price per 1M rank higher; official badges break ties.
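The page does not publish the exact formula behind the balanced score, so the following is only a minimal sketch of one way such a ranking could work. It assumes min-max normalization of latency and of the average input/output price, equal weights for the two, and a hypothetical `official` flag as the tie-breaker. It should not be expected to reproduce the order shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    input_price: float      # USD per 1M input tokens
    output_price: float     # USD per 1M output tokens
    p50_ms: float           # P50 latency in milliseconds
    official: bool = False  # hypothetical flag standing in for the "official" badge


def _normalize(value: float, low: float, high: float) -> float:
    """Scale value into [0, 1]; return 0 when all values are identical."""
    return 0.0 if high == low else (value - low) / (high - low)


def rank(providers, latency_weight: float = 0.5):
    """Sort providers by an assumed balanced score (lower is better)."""
    avg_prices = [(p.input_price + p.output_price) / 2 for p in providers]
    latencies = [p.p50_ms for p in providers]
    lo_p, hi_p = min(avg_prices), max(avg_prices)
    lo_l, hi_l = min(latencies), max(latencies)

    def key(p: Provider):
        avg = (p.input_price + p.output_price) / 2
        score = (latency_weight * _normalize(p.p50_ms, lo_l, hi_l)
                 + (1 - latency_weight) * _normalize(avg, lo_p, hi_p))
        # On equal scores, official providers sort first (False < True).
        return (round(score, 9), not p.official)

    return sorted(providers, key=key)


# Three rows from the sample snapshot on this page.
sample = [
    Provider("OpenRouter", 0.15, 0.60, 380),
    Provider("Together", 0.18, 0.64, 340),
    Provider("Groq", 0.15, 0.60, 200),
]
print([p.name for p in rank(sample)])  # ['Groq', 'OpenRouter', 'Together']
```

Changing `latency_weight` shifts the balance: values near 1 favor fast providers, values near 0 favor cheap ones.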

| Rank | Provider | Badge | Input (USD / 1M tokens) | Output (USD / 1M tokens) | P50 latency |
|------|----------|-------|-------------------------|--------------------------|-------------|
| 1 | Groq | Fastest | $0.15 | $0.60 | 200ms |
| 2 | OpenRouter | | $0.15 | $0.60 | 380ms |
| 3 | Fireworks | | $0.16 | $0.62 | 310ms |
| 4 | OpenAI | | $0.15 | $0.60 | 420ms |
| 5 | Azure OpenAI | | $0.15 | $0.60 | 450ms |
| 6 | Together | | $0.18 | $0.64 | 340ms |

Fastest: Groq, 180ms P50 latency

Best price: DeepSeek, $0.11 input / $0.28 output per 1M tokens


| Rank | Provider | Badge | Input (USD / 1M tokens) | Output (USD / 1M tokens) | P50 latency |
|------|----------|-------|-------------------------|--------------------------|-------------|
| 1 | Groq | Fastest | $0.22 | $0.35 | 180ms |
| 2 | DeepSeek | Best price | $0.11 | $0.28 | 350ms |
| 3 | Fireworks | | $0.16 | $0.30 | 280ms |
| 4 | Lepton | | $0.17 | $0.30 | 300ms |
| 5 | DeepInfra | | $0.17 | $0.31 | 320ms |
| 6 | Together | | $0.18 | $0.32 | 310ms |
| 7 | Nebius | | $0.18 | $0.32 | 330ms |
| 8 | Cerebras Inference | | $0.23 | $0.37 | 260ms |
| 9 | Hyperbolic | | $0.19 | $0.33 | 340ms |
| 10 | Novita | | $0.15 | $0.29 | 400ms |
| 11 | FriendliAI | | $0.19 | $0.32 | 350ms |
| 12 | SambaNova | | $0.24 | $0.38 | 290ms |
| 13 | fal | | $0.19 | $0.33 | 370ms |
| 14 | Baseten | | $0.20 | $0.34 | 360ms |
| 15 | Anyscale | | $0.20 | $0.34 | 380ms |
| 16 | Mistral La Plateforme | | $0.20 | $0.35 | 410ms |
| 17 | SiliconFlow | Domestic payment | $0.12 | $0.24 | 580ms |
| 18 | Perplexity | | $0.21 | $0.36 | 430ms |
| 19 | Replicate | | $0.21 | $0.36 | 450ms |
| 20 | OpenRouter | | $0.15 | $0.75 | 420ms |