Methodology — How We Evaluate & Rank AI Models

AI Analysis evaluates and ranks 390+ AI models using a comprehensive multi-dimensional framework covering intelligence, speed, pricing, and specialized use cases.

Core Evaluation Dimensions

Intelligence Index
A composite score based on benchmark performance across reasoning, knowledge, coding, and multi-modal tasks, normalized to a 0-100 scale.
Speed (Tokens per Second)
Measured output throughput during response generation, in tokens per second (TPS); higher TPS means faster responses.
Pricing
Cost per million tokens for both input and output, including volume discounts and batch pricing.
Context Window
Maximum number of tokens the model can process in a single request.
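To make the intelligence index concrete, here is a minimal sketch of one common way to build such a composite: min-max normalize each benchmark score, then take a weighted average on a 0-100 scale. The benchmark names, bounds, and weights below are illustrative assumptions, not the site's actual formula.

```python
# Sketch of a composite "intelligence index": min-max normalize each
# benchmark score into [0, 1], then take a weighted average scaled to 0-100.
# Benchmark names, bounds, and weights are illustrative assumptions.

def normalize(score: float, lo: float, hi: float) -> float:
    """Map a raw benchmark score into [0, 1], clamped at the bounds."""
    return max(0.0, min(1.0, (score - lo) / (hi - lo)))

def intelligence_index(scores: dict[str, float],
                       bounds: dict[str, tuple[float, float]],
                       weights: dict[str, float]) -> float:
    """Weighted average of normalized benchmark scores, on a 0-100 scale."""
    total_w = sum(weights.values())
    blended = sum(weights[b] * normalize(scores[b], *bounds[b])
                  for b in scores)
    return 100.0 * blended / total_w

scores  = {"reasoning": 78.0, "knowledge": 85.0, "coding": 64.0}
bounds  = {"reasoning": (0, 100), "knowledge": (0, 100), "coding": (0, 100)}
weights = {"reasoning": 0.4, "knowledge": 0.3, "coding": 0.3}

print(round(intelligence_index(scores, bounds, weights), 1))
```

Clamping in `normalize` keeps a single outlier benchmark from pushing the composite outside the 0-100 range.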
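Since input and output tokens are priced separately, comparisons often reduce them to a single blended cost per million tokens. A minimal sketch, assuming a 3:1 input-to-output token mix (the ratio is an illustrative assumption, not the site's actual weighting):

```python
# Blend input/output price-per-million-tokens into one comparable figure.
# The 3:1 input:output token mix is an illustrative assumption.

def blended_price(input_per_m: float, output_per_m: float,
                  input_ratio: float = 3.0) -> float:
    """Cost per million tokens, weighted by an assumed traffic mix."""
    total = input_ratio + 1.0
    return (input_ratio * input_per_m + output_per_m) / total

def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of a single request at the listed per-million rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1e6

# e.g. $3 per 1M input tokens, $15 per 1M output tokens
print(blended_price(3.0, 15.0))            # blended $ per 1M tokens
print(request_cost(2000, 500, 3.0, 15.0))  # cost of one request
```

Volume discounts and batch pricing can be folded in by discounting the per-million rates before blending.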

Specialized Frameworks

Healthcare AI Evaluation (7 Metrics)

CDUI, CSHI, RRS, CDCC, HEBTS, PISI, CCRI — proprietary metrics for clinical safety assessment.

Coding AI Evaluation

Comparisons across HumanEval, SWE-Bench, and other coding-specific benchmarks.

Data Sources

  • Artificial Analysis API — Real-time performance and pricing data
  • OpenRouter API — Provider coverage and external benchmarks
  • Provider Documentation — Official specifications and capabilities
  • Community Benchmarks — Independent evaluation results