| 1 | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) | Anthropic | 61.4 | 65.9 | $10.94 |
| 2 | GPT-5.5 (xhigh) | OpenAI | 60.2 | 79.0 | $11.25 |
| 3 | GPT-5.5 (high) | OpenAI | 58.9 | 64.4 | $11.25 |
| 4 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | Anthropic | 57.3 | 51.7 | $10.94 |
| 5 | Gemini 3.1 Pro Preview | Google | 57.2 | 119.5 | $4.50 |
| 6 | GPT-5.4 (xhigh) | OpenAI | 56.8 | 89.3 | $5.63 |
| 7 | GPT-5.5 (medium) | OpenAI | 56.7 | 61.0 | $11.25 |
| 8 | Qwen3.7 Max | Alibaba | 56.6 | 200.5 | $3.75 |
| 9 | Gemini 3.5 Flash (high) | Google | 55.3 | 226.5 | $3.38 |
| 10 | Gemini 3.5 Flash (medium) | Google | 54.8 | 222.5 | $3.38 |
| 11 | MiMo-V2.5-Pro | Xiaomi | 53.8 | 51.4 | $0.54 |
| 12 | GPT-5.3 Codex (xhigh) | OpenAI | 53.6 | 72.3 | $4.81 |
| 13 | Grok 4.3 (high) | xAI | 53.2 | 215.7 | $1.56 |
| 14 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 52.9 | 50.3 | $10.94 |
| 15 | Muse Spark | Meta | 52.2 | 0.0 | Free |
| 16 | Qwen3.6 Max Preview | Alibaba | 51.8 | 37.1 | $2.92 |
| 17 | Claude Opus 4.7 (Non-reasoning, High Effort) | Anthropic | 51.8 | 45.8 | $10.94 |
| 18 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 51.7 | 64.8 | $6.56 |
| 19 | DeepSeek V4 Pro (Reasoning, Max Effort) | DeepSeek | 51.5 | 54.0 | $0.54 |
| 20 | GLM-5.1 (Reasoning) | Z AI | 51.4 | 57.9 | $2.15 |