All models

Live speed benchmarks for every model across all providers. Numbers update continuously, and within each provider the models are sorted by their latest tokens per second (TPS).

54 total models benchmarked

47 currently available

3 providers

ollama

Model	TPS now	TPS 24h avg	TTFT	Reliability	Intelligence Index
1 gpt-oss:20b	1115.2	111.2	6.7s	100%	14.9
2 deepseek-v4-flash	253.8	191.6	514ms	100%	40.3
3 ministral-3:3b (non-reasoning)	242.6	192.9	442ms	100%	5.6
4 gpt-oss:120b	216.7	140.3	416ms	100%	23.8
5 qwen3-coder:480b (non-reasoning)	194.6	91.5	740ms	100%	18
6 kimi-k2.5	182.6	128.7	1.9s	100%	38.1
7 nemotron-3-nano:30b (non-reasoning)	173.9	270.1	435ms	100%	7.4
8 deepseek-v4-pro	172.8	155.3	718ms	100%	44.3
9 glm-5.1	158.1	146.5	822ms	100%	40.2
10 ministral-3:8b (non-reasoning)	157.3	131.7	472ms	100%	8.9
11 glm-5	135.1	103.1	701ms	100%	39.5
12 kimi-k2.7-code	134.0	139.5	686ms	100%	41.9
13 minimax-m2.1	132.7	130.9	—	100%	31.4
14 kimi-k2.6	127.0	91.2	630ms	100%	42.8
15 rnj-1:8b	126.6	126.4	522ms	100%	—
16 gemini-3-flash-preview	118.2	113.7	1.7s	100%	37.8
17 gemma4:31b	114.7	119.1	410ms	100%	29.4
18 ministral-3:14b (non-reasoning)	111.7	102.2	428ms	100%	10
19 nemotron-3-super	103.5	53.8	451ms	100%	25.4
20 qwen3-coder-next (non-reasoning)	94.3	94.0	409ms	100%	21.2
21 minimax-m2.5	89.7	77.5	1.3s	100%	33.7
22 qwen3.5:397b	79.5	94.5	407ms	100%	33.7
23 gemma3:4b (non-reasoning)	75.7	50.7	547ms	97%	1.1
24 glm-5.2	75.2	110.1	848ms	100%	50.7
25 devstral-small-2:24b (non-reasoning)	72.5	58.9	2.0s	100%	13.1
26 devstral-2:123b (non-reasoning)	63.4	61.8	544ms	100%	15.5
27 glm-4.7	62.6	84.1	1.6s	100%	33.8
28 mistral-large-3:675b (non-reasoning)	60.2	58.9	742ms	100%	16.2
29 minimax-m3	45.0	61.1	741ms	100%	44.4
30 minimax-m2.7	42.8	39.8	872ms	100%	38.1
31 nemotron-3-ultra	40.8	18.7	971ms	98%	37.8
32 deepseek-v3.2	23.9	26.9	865ms	99%	33.4
33 gemma3:12b (non-reasoning)	19.7	37.9	632ms	100%	3.4
34 gemma3:27b (non-reasoning)	15.4	16.1	585ms	100%	4.8
35 deepseek-v3.1:671b (non-reasoning)	9.0	10.0	1.1s	100%	21

opencode-zen

Model	TPS now	TPS 24h avg	TTFT	Reliability	Intelligence Index
1 MiMo V2.5 (Free)	292.5	299.7	4.0s	90%	—
2 North Mini Code (Free)	125.3	390.9	258ms	82%	20.6
3 Big Pickle	93.4	88.5	857ms	40%	—
4 DeepSeek V4 Flash (Free)	91.4	191.1	1.0s	41%	40.3
5 Nemotron 3 Ultra (Free)	10.4	16.9	28.8s	48%	—

opencode-go

Model	TPS now	TPS 24h avg	TTFT	Reliability	Intelligence Index
1 GLM-5.1	107.6	104.0	504ms	48%	40.2
2 DeepSeek V4 Flash	92.5	89.8	995ms	46%	40.3
3 MiMo V2.5	71.1	154.3	2.8s	83%	40.3
4 DeepSeek V4 Pro	65.6	73.4	1.2s	46%	44.3
5 Kimi K2.7 Code	54.0	56.1	1.1s	83%	41.9
6 GLM-5.2	53.5	79.1	2.2s	48%	50.7
7 MiMo V2.5 Pro	41.8	59.8	1.8s	85%	42.2

7 models not currently benchmarked

MiniMax M2.7 — opencode-go-anthropic
MiniMax M2.5 — opencode-go-anthropic
Qwen3.7 Max — opencode-go-anthropic
Qwen3.7 Plus — opencode-go-anthropic
Qwen3.6 Plus — opencode-go-anthropic
MiniMax M3 — opencode-go-anthropic
Kimi K2.6 — opencode-go

How these numbers are measured

Every model is benchmarked with the same prompt, the same measurement method, and the same ~10-minute cadence — regardless of provider. That makes the numbers directly comparable across providers. See the methodology page for the full measurement spec.
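
As a rough illustration of what the TTFT and TPS columns capture, here is a minimal sketch that derives both from the arrival times of streamed tokens. It is not the benchmark's actual implementation; the methodology page is the authority. The names `StreamTiming`, `ttft_seconds`, `tokens_per_second`, and `format_ttft` are hypothetical, and excluding the wait for the first token from the throughput window is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamTiming:
    """Timestamps in seconds (from a monotonic clock) for one streamed completion."""
    request_sent: float
    token_times: Sequence[float]  # arrival time of each output token, in order


def ttft_seconds(t: StreamTiming) -> float:
    """Time to first token: delay between sending the request and the first token."""
    if not t.token_times:
        raise ValueError("no tokens received")
    return t.token_times[0] - t.request_sent


def tokens_per_second(t: StreamTiming) -> float:
    """Throughput over the generation window only.

    Assumption: the wait for the first token is excluded, so TPS and TTFT
    measure separate things. The real benchmark may define this differently.
    """
    n = len(t.token_times)
    if n < 2:
        raise ValueError("need at least two tokens to measure throughput")
    elapsed = t.token_times[-1] - t.token_times[0]
    if elapsed <= 0:
        raise ValueError("token timestamps must be increasing")
    return (n - 1) / elapsed


def format_ttft(seconds: float) -> str:
    """Render a latency the way the tables do: ms below one second, else seconds."""
    ms = round(seconds * 1000)
    if ms < 1000:
        return f"{ms}ms"
    return f"{seconds:.1f}s"


if __name__ == "__main__":
    # Synthetic run: first token after 0.5 s, then one token every 10 ms.
    run = StreamTiming(
        request_sent=0.0,
        token_times=[0.5 + i * 0.01 for i in range(101)],
    )
    print("TTFT:", format_ttft(ttft_seconds(run)))
    print("TPS:", round(tokens_per_second(run), 1))
```

Keeping the two metrics separate in this way would explain how a row can pair a high TPS with a multi-second TTFT: one describes the wait before output starts, the other the speed once it is flowing.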