Quality Score
Weighted composite of five uderia workloads: RAG/Document Q&A (30%), Tool Use & Function Calling (25%), Long-Context Reasoning (20%), Structured Output / JSON (15%), and Multi-Turn Conversation (10%). Scores are derived from published benchmarks (GPQA, BFCL, SWE-bench, IFEval) and from Chatbot Arena Elo where available.
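A minimal sketch of how the composite combines, assuming each workload already has a 0–100 sub-score; the weights come straight from the list above, while the key names and example values are illustrative.

```python
# Illustrative Quality Score composite. Weights match the section above;
# the per-workload sub-scores (each 0-100) are assumed inputs.
QUALITY_WEIGHTS = {
    "rag_doc_qa": 0.30,
    "tool_use": 0.25,
    "long_context": 0.20,
    "structured_output": 0.15,
    "multi_turn": 0.10,
}

def quality_score(subscores: dict[str, float]) -> float:
    """Weighted sum of the five workload sub-scores."""
    return sum(QUALITY_WEIGHTS[k] * subscores[k] for k in QUALITY_WEIGHTS)

# Example: a model strong at RAG but weaker at structured output.
print(round(quality_score({
    "rag_doc_qa": 88, "tool_use": 75, "long_context": 70,
    "structured_output": 60, "multi_turn": 80,
}), 2))  # -> 76.15
```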
Speed Score
Combines throughput (tokens/second, higher = better) and TTFT (time to first token, lower = better). Speed score = 60% throughput component + 40% TTFT component, each normalized to 0–100 across all 40 models. Matters most for uderia's interactive IDEATE profile and real-time streaming.
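A sketch of the two-component score, assuming min-max normalization to 0–100 across the model set (the normalization method and fleet ranges below are assumptions, not stated by the source); the 60/40 split is from the text.

```python
# Hedged sketch of the Speed Score. Min-max normalization is assumed;
# the fleet ranges in the example are illustrative.
def normalize(value: float, lo: float, hi: float, invert: bool = False) -> float:
    """Scale to 0-100; invert for lower-is-better metrics such as TTFT."""
    scaled = 100 * (value - lo) / (hi - lo)
    return 100 - scaled if invert else scaled

def speed_score(tps: float, ttft_s: float,
                tps_range: tuple[float, float],
                ttft_range: tuple[float, float]) -> float:
    throughput = normalize(tps, *tps_range)                # higher = better
    latency = normalize(ttft_s, *ttft_range, invert=True)  # lower = better
    return 0.60 * throughput + 0.40 * latency

# Example: 150 tok/s and 0.4 s TTFT against an assumed fleet range.
print(round(speed_score(150, 0.4, (20, 300), (0.2, 5.0)), 1))  # -> 66.2
```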
Price Efficiency
Blended price = input × 0.70 + output × 0.30 (uderia's typical token traffic ratio, reflecting the 16K max output). The best available price for the selected provider type is used; self-hosted models score 100. The score uses a log scale, so the gap between $0.01 and $0.10 counts as much as the gap between $1 and $10.
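A sketch under the stated assumptions: the 70/30 blend, the log scale, and the self-hosted rule are from the text, while the cheapest/priciest anchor prices that pin the log scale to 0–100 are illustrative.

```python
import math

def blended_price(input_per_m: float, output_per_m: float) -> float:
    """uderia's typical traffic mix: 70% input tokens, 30% output tokens."""
    return 0.70 * input_per_m + 0.30 * output_per_m

def price_score(blended: float, cheapest: float, priciest: float,
                self_hosted: bool = False) -> float:
    """Log-scale score: each 10x price step costs the same number of points."""
    if self_hosted:
        return 100.0  # self-hosted models score 100 by definition
    span = math.log10(priciest) - math.log10(cheapest)
    return 100 * (math.log10(priciest) - math.log10(blended)) / span

# $0.30/M input, $1.20/M output against assumed $0.05-$30 blended anchors.
b = blended_price(0.30, 1.20)          # -> 0.57
print(round(price_score(b, 0.05, 30.0), 1))  # -> 62.0
```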
Tier Classification
L: Open-weight / self-hostable models (Gemma, Llama, Qwen, DeepSeek, Mistral open). M: Lite/Mini/Nano/Flash-Lite closed-source variants. H: Standard/Flash/Sonnet-class models. E: Pro/Opus/flagship models, the best available from each provider. Classification is capability-based, not purely price-based.
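Because tiers are assigned by capability review rather than computed from price, a static lookup is the natural representation; a minimal sketch using only the family names given above, with the specific map entries being illustrative.

```python
from enum import Enum

class Tier(Enum):
    L = "open-weight / self-hostable"
    M = "Lite/Mini/Nano/Flash-Lite closed-source variant"
    H = "Standard/Flash/Sonnet-class"
    E = "Pro/Opus/flagship"

# Hand-curated map, mirroring the examples in the text. Assignments reflect
# capability review, so this stays a static table rather than a price rule.
MODEL_TIERS: dict[str, Tier] = {
    "Llama": Tier.L,
    "Qwen": Tier.L,
    "DeepSeek": Tier.L,
}
```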
Update Cadence
Refreshed monthly, or immediately following major model releases or pricing changes. At each update, provider pricing pages, Artificial Analysis benchmarks, and Chatbot Arena rankings are re-verified against primary sources to ensure accuracy.