Which AI model is actually the best? We aggregate 20+ benchmarks so you don't have to.
Tired of cherry-picked benchmarks and marketing hype? Showdown provides transparent, community-maintained rankings of AI language models across real-world categories.
All data is open. All methodology is transparent. All contributions are welcome.
Visit showdown.best to explore the rankings.
Want to run it locally?
```shell
git clone https://github.com/verseles/showdown.git
cd showdown
npm install
npm run dev
```
We aggregate scores from 20+ industry benchmarks, weighted by practical importance:
| Category | Weight | What it measures |
|---|---|---|
| Coding | 25% | Real GitHub issues, live coding challenges |
| Reasoning | 25% | PhD science questions, novel problem solving |
| Agents & Tools | 18% | API usage, multi-step tasks, browser automation |
| Conversation | 12% | Creative writing, following complex instructions |
| Math | 10% | Competition math, word problems |
| Multimodal | 7% | Understanding images, charts, diagrams |
| Multilingual | 3% | Performance across languages |
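For illustration, the category weights in the table above can be combined into an overall score as a simple weighted average. The sketch below is hypothetical — the per-category scores and function names are assumptions, not the site's actual scoring code.

```typescript
// Category weights from the table above (they sum to 1.00).
const WEIGHTS: Record<string, number> = {
  coding: 0.25,
  reasoning: 0.25,
  agents: 0.18,
  conversation: 0.12,
  math: 0.1,
  multimodal: 0.07,
  multilingual: 0.03,
};

// Weighted average of a model's per-category scores (each on a 0-100 scale).
// Missing categories count as 0 in this sketch.
function overallScore(scores: Record<string, number>): number {
  let total = 0;
  for (const [category, weight] of Object.entries(WEIGHTS)) {
    total += weight * (scores[category] ?? 0);
  }
  return total;
}
```

Because the weights sum to 1.00, a model scoring 100 in every category gets an overall score of exactly 100.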
Scoring:
Found incorrect data? Open an issue with the correct value and source.
Want a model added? Open an issue with available benchmark scores.
The dataset lives in `data/showdown.json`. Rankings aggregate data from trusted sources.
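A contributor editing `data/showdown.json` might sanity-check entries before opening a pull request. The sketch below assumes a hypothetical schema (model name plus per-category scores on a 0-100 scale); the real file's structure may differ.

```typescript
// Hypothetical schema — check data/showdown.json for the real structure.
interface ModelEntry {
  name: string;
  scores: Record<string, number>; // e.g. { coding: 72.5, reasoning: 68.0 }
}

// True when every score is a finite number within the 0-100 range.
function validateEntry(entry: ModelEntry): boolean {
  return Object.values(entry.scores).every(
    (s) => Number.isFinite(s) && s >= 0 && s <= 100,
  );
}
```

Running a check like this locally catches typos (a score of 720 instead of 72.0) before they reach the rankings.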
AGPL-3.0 - Keep it open!
Built with Svelte. Hosted on Cloudflare. Made for the community.