All models
Live speed benchmarks for every model across all providers. Numbers update continuously, and within each provider the models are sorted by their latest tokens per second (TPS). TTFT is the time to first token.
ollama
| Rank | Model | TPS now | TPS 24h avg | TTFT | Reliability | Intelligence Index |
|---|---|---|---|---|---|---|
| 1 | gpt-oss:20b | 1115.2 | 111.2 | 6.7s | 100% | 14.9 |
| 2 | deepseek-v4-flash | 253.8 | 191.6 | 514ms | 100% | 40.3 |
| 3 | ministral-3:3b (non-reasoning) | 242.6 | 192.9 | 442ms | 100% | 5.6 |
| 4 | gpt-oss:120b | 216.7 | 140.3 | 416ms | 100% | 23.8 |
| 5 | qwen3-coder:480b (non-reasoning) | 194.6 | 91.5 | 740ms | 100% | 18 |
| 6 | kimi-k2.5 | 182.6 | 128.7 | 1.9s | 100% | 38.1 |
| 7 | nemotron-3-nano:30b (non-reasoning) | 173.9 | 270.1 | 435ms | 100% | 7.4 |
| 8 | deepseek-v4-pro | 172.8 | 155.3 | 718ms | 100% | 44.3 |
| 9 | glm-5.1 | 158.1 | 146.5 | 822ms | 100% | 40.2 |
| 10 | ministral-3:8b (non-reasoning) | 157.3 | 131.7 | 472ms | 100% | 8.9 |
| 11 | glm-5 | 135.1 | 103.1 | 701ms | 100% | 39.5 |
| 12 | kimi-k2.7-code | 134.0 | 139.5 | 686ms | 100% | 41.9 |
| 13 | minimax-m2.1 | 132.7 | 130.9 | — | 100% | 31.4 |
| 14 | kimi-k2.6 | 127.0 | 91.2 | 630ms | 100% | 42.8 |
| 15 | rnj-1:8b | 126.6 | 126.4 | 522ms | 100% | — |
| 16 | gemini-3-flash-preview | 118.2 | 113.7 | 1.7s | 100% | 37.8 |
| 17 | gemma4:31b | 114.7 | 119.1 | 410ms | 100% | 29.4 |
| 18 | ministral-3:14b (non-reasoning) | 111.7 | 102.2 | 428ms | 100% | 10 |
| 19 | nemotron-3-super | 103.5 | 53.8 | 451ms | 100% | 25.4 |
| 20 | qwen3-coder-next (non-reasoning) | 94.3 | 94.0 | 409ms | 100% | 21.2 |
| 21 | minimax-m2.5 | 89.7 | 77.5 | 1.3s | 100% | 33.7 |
| 22 | qwen3.5:397b | 79.5 | 94.5 | 407ms | 100% | 33.7 |
| 23 | gemma3:4b (non-reasoning) | 75.7 | 50.7 | 547ms | 97% | 1.1 |
| 24 | glm-5.2 | 75.2 | 110.1 | 848ms | 100% | 50.7 |
| 25 | devstral-small-2:24b (non-reasoning) | 72.5 | 58.9 | 2.0s | 100% | 13.1 |
| 26 | devstral-2:123b (non-reasoning) | 63.4 | 61.8 | 544ms | 100% | 15.5 |
| 27 | glm-4.7 | 62.6 | 84.1 | 1.6s | 100% | 33.8 |
| 28 | mistral-large-3:675b (non-reasoning) | 60.2 | 58.9 | 742ms | 100% | 16.2 |
| 29 | minimax-m3 | 45.0 | 61.1 | 741ms | 100% | 44.4 |
| 30 | minimax-m2.7 | 42.8 | 39.8 | 872ms | 100% | 38.1 |
| 31 | nemotron-3-ultra | 40.8 | 18.7 | 971ms | 98% | 37.8 |
| 32 | deepseek-v3.2 | 23.9 | 26.9 | 865ms | 99% | 33.4 |
| 33 | gemma3:12b (non-reasoning) | 19.7 | 37.9 | 632ms | 100% | 3.4 |
| 34 | gemma3:27b (non-reasoning) | 15.4 | 16.1 | 585ms | 100% | 4.8 |
| 35 | deepseek-v3.1:671b (non-reasoning) | 9.0 | 10.0 | 1.1s | 100% | 21 |
opencode-zen
| Rank | Model | TPS now | TPS 24h avg | TTFT | Reliability | Intelligence Index |
|---|---|---|---|---|---|---|
| 1 | MiMo V2.5 (Free) | 292.5 | 299.7 | 4.0s | 90% | — |
| 2 | North Mini Code (Free) | 125.3 | 390.9 | 258ms | 82% | 20.6 |
| 3 | Big Pickle | 93.4 | 88.5 | 857ms | 40% | — |
| 4 | DeepSeek V4 Flash (Free) | 91.4 | 191.1 | 1.0s | 41% | 40.3 |
| 5 | Nemotron 3 Ultra (Free) | 10.4 | 16.9 | 28.8s | 48% | — |
opencode-go
| Rank | Model | TPS now | TPS 24h avg | TTFT | Reliability | Intelligence Index |
|---|---|---|---|---|---|---|
| 1 | GLM-5.1 | 107.6 | 104.0 | 504ms | 48% | 40.2 |
| 2 | DeepSeek V4 Flash | 92.5 | 89.8 | 995ms | 46% | 40.3 |
| 3 | MiMo V2.5 | 71.1 | 154.3 | 2.8s | 83% | 40.3 |
| 4 | DeepSeek V4 Pro | 65.6 | 73.4 | 1.2s | 46% | 44.3 |
| 5 | Kimi K2.7 Code | 54.0 | 56.1 | 1.1s | 83% | 41.9 |
| 6 | GLM-5.2 | 53.5 | 79.1 | 2.2s | 48% | 50.7 |
| 7 | MiMo V2.5 Pro | 41.8 | 59.8 | 1.8s | 85% | 42.2 |
7 models not currently benchmarked
- MiniMax M2.7 — opencode-go-anthropic
- MiniMax M2.5 — opencode-go-anthropic
- Qwen3.7 Max — opencode-go-anthropic
- Qwen3.7 Plus — opencode-go-anthropic
- Qwen3.6 Plus — opencode-go-anthropic
- MiniMax M3 — opencode-go-anthropic
- Kimi K2.6 — opencode-go
How these numbers are measured
Every model is benchmarked with the same prompt, the same measurement method, and the same cadence of roughly 10 minutes, regardless of provider. This makes the numbers directly comparable across providers. See the methodology page for the full measurement spec.
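As a rough illustration of how TTFT and TPS could be derived from a single streamed response, here is a minimal Python sketch. It is not the measurement code behind this page: the names `SpeedSample`, `measure`, and `format_ttft` are hypothetical, and the definitions are assumptions (TTFT as the delay from request start to the first token, TPS as the tokens that follow the first one divided by the time they took to arrive). The methodology page remains the authoritative spec.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SpeedSample:
    ttft_s: Optional[float]  # seconds from request start to the first token
    tps: Optional[float]     # tokens per second after the first token


def measure(request_start: float, token_times: Iterable[float]) -> SpeedSample:
    """Derive TTFT and TPS from token arrival timestamps, in seconds.

    Assumed definitions (hypothetical, not taken from the methodology page):
    TTFT = first token time - request start;
    TPS  = (token count - 1) / (last token time - first token time).
    """
    times = sorted(token_times)
    if not times:
        # No tokens arrived, so neither metric can be computed.
        return SpeedSample(None, None)
    ttft = times[0] - request_start
    generation_time = times[-1] - times[0]
    if len(times) < 2 or generation_time <= 0:
        # A single token (or zero elapsed time) gives no usable rate.
        return SpeedSample(ttft, None)
    return SpeedSample(ttft, (len(times) - 1) / generation_time)


def format_ttft(seconds: float) -> str:
    """Render TTFT in the mixed ms/s style used in the tables above."""
    ms = round(seconds * 1000)
    return f"{ms}ms" if ms < 1000 else f"{seconds:.1f}s"


# Example: request sent at t=10.0s, five tokens arriving every 0.5s.
sample = measure(10.0, [10.5, 11.0, 11.5, 12.0, 12.5])
print(format_ttft(sample.ttft_s), sample.tps)
```

Excluding the wait for the first token from the TPS denominator keeps the two metrics independent, so a model with a slow start but fast generation is not penalized twice. Whether the benchmark makes the same choice is not stated here.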