LLM tokens per second — live multi-provider benchmark

Real inference speed, measured continuously. Every row is a live model from Ollama, OpenCode Zen, or OpenCode Go. Rows are sorted by tokens per second, and each model is re-benchmarked roughly every 10 minutes.
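The page does not say how tokens per second is computed. The sketch below shows one common way to derive a decode rate and a time to first token from the arrival times of streamed tokens. The names `StreamTiming` and `summarize_stream` are hypothetical. Excluding the wait for the first token from the rate is an assumption, not the benchmark's documented method.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamTiming:
    ttft_s: float             # seconds from request start to first token
    tokens_per_second: float  # decode rate after the first token


def summarize_stream(request_start: float, token_times: Sequence[float]) -> StreamTiming:
    """Summarize one streamed response (hypothetical helper).

    request_start and token_times are timestamps in seconds from the same
    clock, for example time.perf_counter(). token_times holds one entry per
    received token, in arrival order.
    """
    if not token_times:
        raise ValueError("no tokens received")

    ttft = token_times[0] - request_start

    # Assumption: measure throughput over the decode window only, so a slow
    # first token does not drag the rate down.
    decode_window = token_times[-1] - token_times[0]
    if len(token_times) < 2 or decode_window <= 0:
        rate = 0.0
    else:
        rate = (len(token_times) - 1) / decode_window

    return StreamTiming(ttft_s=ttft, tokens_per_second=rate)


if __name__ == "__main__":
    # Five tokens arriving 0.1 s apart after a 0.5 s wait.
    timing = summarize_stream(0.0, [0.5, 0.6, 0.7, 0.8, 0.9])
    print(timing)
```

A benchmark that divides total tokens by total wall-clock time instead would report lower figures for models with a long initial delay, so the two conventions are not directly comparable.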

Model | Provider | Tokens/s | (second figure) | Latency | Percent | Intelligence Index | Last benchmark
North Mini Code (Free) | OpenCode Zen | 494.4 | 329.6 | 3.1s | 63% | 20.6 | 6m ago
deepseek-v4-flash | Ollama | 251.7 | 182.7 | 488ms | 100% | 40.3 | 3m ago
MiMo V2.5 (Free) | OpenCode Zen | 216.8 | 313.7 | 3.9s | 75% | - | 8m ago
nemotron-3-nano:30b (non-reasoning) | Ollama | 203.9 | 293.1 | 461ms | 100% | 7.4 | 10m ago
deepseek-v4-pro | Ollama | 164.7 | 156.3 | 816ms | 100% | 44.3 | 3m ago
kimi-k2.5 | Ollama | 157.1 | 128.0 | 908ms | 100% | 38.1 | 51s ago
ministral-3:8b (non-reasoning) | Ollama | 149.0 | 132.1 | 448ms | 100% | 8.9 | 11m ago
kimi-k2.7-code | Ollama | 148.8 | 138.0 | 890ms | 100% | 41.9 | 38s ago
glm-5.2 | Ollama | 141.1 | 107.6 | 930ms | 100% | 50.7 | 1m ago
gpt-oss:120b | Ollama | 128.1 | 145.0 | 458ms | 100% | 23.8 | 1m ago
rnj-1:8b | Ollama | 126.7 | 126.3 | 397ms | 100% | - | 8m ago
gemma4:31b | Ollama | 126.6 | 119.3 | 358ms | 100% | 29.4 | 2m ago
glm-5 | Ollama | 124.4 | 98.4 | 1.1s | 100% | 39.5 | 1m ago
qwen3.5:397b | Ollama | 120.3 | 96.5 | 567ms | 100% | 33.7 | 8m ago
gemini-3-flash-preview | Ollama | 106.1 | 111.8 | 1.8s | 100% | 37.8 | 2m ago
minimax-m2.1 | Ollama | 104.7 | 126.4 | - | 100% | 31.4 | 32s ago
gpt-oss:20b | Ollama | 103.9 | 108.8 | 779ms | 100% | 14.9 | 57s ago
kimi-k2.6 | Ollama | 102.7 | 94.3 | 653ms | 100% | 42.8 | 45s ago
minimax-m3 | Ollama | 101.6 | 61.4 | 907ms | 100% | 44.4 | 11m ago
qwen3-coder-next (non-reasoning) | Ollama | 99.2 | 92.4 | 344ms | 100% | 21.2 | 9m ago
ministral-3:14b (non-reasoning) | Ollama | 97.2 | 102.5 | 457ms | 100% | 10 | 11m ago
glm-5.1 | Ollama | 96.5 | 156.7 | 3.2s | 100% | 40.2 | 1m ago
nemotron-3-super | Ollama | 94.4 | 56.1 | 463ms | 100% | 25.4 | 10m ago
ministral-3:3b (non-reasoning) | Ollama | 91.1 | 187.0 | 487ms | 100% | 5.6 | 11m ago
minimax-m2.5 | Ollama | 89.5 | 77.7 | 304ms | 100% | 33.7 | 26s ago
glm-4.7 | Ollama | 88.3 | 87.8 | 1.6s | 100% | 33.8 | 1m ago
devstral-2:123b (non-reasoning) | Ollama | 76.6 | 60.1 | 567ms | 100% | 15.5 | 3m ago
devstral-small-2:24b (non-reasoning) | Ollama | 53.5 | 58.7 | 550ms | 100% | 13.1 | 2m ago
gemma3:4b (non-reasoning) | Ollama | 51.1 | 51.1 | 539ms | 97% | 1.1 | 2m ago
mistral-large-3:675b (non-reasoning) | Ollama | 47.7 | 58.5 | 629ms | 100% | 16.2 | 11m ago
gemma3:12b (non-reasoning) | Ollama | 45.4 | 37.8 | 645ms | 100% | 3.4 | 2m ago
minimax-m2.7 | Ollama | 37.2 | 39.5 | 1.2s | 100% | 38.1 | 11m ago
qwen3-coder:480b (non-reasoning) | Ollama | 32.0 | 89.8 | 556ms | 100% | 18 | 9m ago
gemma3:27b (non-reasoning) | Ollama | 18.5 | 16.0 | 613ms | 100% | 4.8 | 2m ago
nemotron-3-ultra | Ollama | 12.0 | 17.8 | 19.2s | 99% | 37.8 | 10m ago
deepseek-v3.1:671b (non-reasoning) | Ollama | 11.7 | 10.2 | 3.3s | 100% | 21 | 5m ago
deepseek-v3.2 | Ollama | 3.9 | 25.4 | 1.0s | 99% | 33.4 | 4m ago

Intelligence Index scores from Artificial Analysis.

12 models currently unavailable