Methodology — How We Evaluate & Rank AI Models
AI Analysis evaluates and ranks 390+ AI models using a comprehensive multi-dimensional framework covering intelligence, speed, pricing, and specialized use cases.
Core Evaluation Dimensions
- Intelligence Index: composite score based on benchmark performance across reasoning, knowledge, coding, and multi-modal tasks, normalized to a 0-100 scale.
- Speed (tokens per second): measured throughput during response generation; higher TPS means faster responses.
- Pricing: cost per million tokens for both input and output, including volume discounts and batch pricing.
- Context Window: the maximum number of tokens the model can process in a single request.
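To make the first three dimensions concrete, here is a minimal sketch of how a 0-100 composite index and a per-request cost can be computed. The benchmark names, score ranges, and weights are invented for illustration and are not AI Analysis's actual formula:

```python
# Hypothetical sketch; benchmark names and weights are illustrative,
# not the actual AI Analysis methodology.
def normalize(score, lo, hi):
    """Min-max normalize a raw benchmark score to the 0-100 scale."""
    return 100 * (score - lo) / (hi - lo)

def intelligence_index(scores, ranges, weights):
    """Weighted average of normalized per-benchmark scores."""
    total = sum(weights.values())
    return sum(
        weights[b] * normalize(scores[b], *ranges[b]) for b in scores
    ) / total

def cost_usd(in_tok, out_tok, in_price, out_price):
    """Request cost in USD given per-million-token input/output prices."""
    return (in_tok * in_price + out_tok * out_price) / 1_000_000

# Illustrative scores as fractions of each benchmark's maximum.
scores = {"reasoning": 0.82, "knowledge": 0.75, "coding": 0.68}
ranges = {b: (0.0, 1.0) for b in scores}
weights = {"reasoning": 0.4, "knowledge": 0.3, "coding": 0.3}
print(round(intelligence_index(scores, ranges, weights), 1))  # 75.7

# 10k input tokens at $3/M plus 2k output tokens at $15/M.
print(cost_usd(10_000, 2_000, 3.0, 15.0))  # 0.06
```

A real pipeline would also need per-benchmark score ranges (since benchmarks use different scales) before the weighted average is meaningful.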
Specialized Frameworks
- CDUI, CSHI, RRS, CDCC, HEBTS, PISI, CCRI: proprietary metrics for clinical safety assessment.
- HumanEval, SWE-Bench, and other coding-specific benchmarks for code-generation comparisons.
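Coding benchmarks such as HumanEval typically report pass@k, the probability that at least one of k sampled completions passes the unit tests. The standard unbiased estimator, given n samples of which c pass, is 1 - C(n-c, k)/C(n, k):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: probability that at least one of k
    completions, drawn without replacement from n samples of which c
    are correct, passes the tests."""
    if n - c < k:
        return 1.0  # every draw of k must include a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# 3 of 10 samples pass: pass@1 = 1 - 7/10 = 0.3
print(pass_at_k(10, 3, 1))  # 0.3
```

Computing the estimator over many samples (rather than naively sampling k completions once) reduces the variance of the reported score.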
Data Sources
- Artificial Analysis API — Real-time performance and pricing data
- OpenRouter API — Provider coverage and external benchmarks
- Provider Documentation — Official specifications and capabilities
- Community Benchmarks — Independent evaluation results
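Combining these sources requires merging per-model records from several feeds into one entry. A minimal sketch of one merge policy (field names and values here are invented; earlier sources take precedence for conflicting fields):

```python
# Hypothetical merge of per-model records from multiple data sources;
# field names and values are illustrative, not a real API schema.
def merge_model_record(*sources):
    """Merge dicts in priority order: the first source to supply a
    field wins; later sources only fill in missing fields."""
    merged = {}
    for record in sources:
        for field, value in record.items():
            merged.setdefault(field, value)
    return merged

api_data = {"model": "example-model", "tps": 112.0, "price_in": 3.0}
docs_data = {"model": "example-model", "context_window": 200_000}
merged = merge_model_record(api_data, docs_data)
print(merged["tps"], merged["context_window"])  # 112.0 200000
```

In practice each field would also carry a source label and a timestamp so that stale or conflicting values can be audited.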