LLM Artificial Analysis Index (08.2025)
Source: https://artificialanalysis.ai
OpenAI (USA)
- GPT-4o: OpenAI's flagship multimodal model with an estimated ~1.8 trillion parameters and a ~128K-token context window. High performance, fast inference, and multimedia support (explodingtopics.com, codingscape.com).
- GPT-4.1: A follow-up to GPT-4 (released April 2025) with a 1-million-token context window, optimized for reasoning, vision, and coding workloads.
- GPT-5 (in development): Release expected imminently (as of August 6, 2025), promising enhanced reasoning and "test-time compute" capabilities.
- GPT-OSS-120B & GPT-OSS-20B: Open-weight models released August 5, 2025. The 20B model runs on consumer laptops (~16 GB of VRAM); the 120B model fits on a single 80 GB GPU. Both perform comparably to OpenAI's proprietary o3-mini and o4-mini models; a loading sketch follows below.
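Because the weights are freely downloadable, running the smaller checkpoint locally can be as short as the sketch below. This is a minimal illustration, assuming the Hugging Face model id `openai/gpt-oss-20b` and chat-format support in a recent `transformers` release; actual memory behavior depends on your hardware and dtype.

```python
# Minimal sketch: text generation with an open-weight checkpoint via
# Hugging Face transformers (model id assumed from the release notes).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # open-weight checkpoint (Apache 2.0)
    torch_dtype="auto",          # let transformers pick a dtype that fits
    device_map="auto",           # place weights on available GPU/CPU memory
)

messages = [{"role": "user", "content": "In one sentence, what is an open-weight model?"}]
result = generator(messages, max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```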
Google DeepMind / Google
- Gemini 2.5 Pro: Released June 2025 with a ~1-million-token context window, full multimodal support (text, image, video, audio), and top-tier benchmark performance.
- Gemini 2.5 Flash / Flash-Lite: Cost- and latency-optimized variants with the same massive context window and multimodal capabilities.
- Gemma 3 / Gemma 3n (open): Google's open models (1B to 27B parameters), multilingual and multimodal, designed for on-device deployment.
Anthropic
- Claude 3.7 Sonnet (released February 2025) and Claude Opus 4 (released May 2025): Proprietary, multimodal, 200K-token context window, focused on safety and on chat-based conversational and reasoning tasks.
DeepSeek-AI (China)
- DeepSeek-V3 and DeepSeek-R1 (R1-0528): Open-source (open-weight) mixture-of-experts models with ~671B total parameters, ~37B active per token, and a 128K-token context window. Excellent on coding, math, and reasoning benchmarks; V3 released December 2024, R1 in January 2025 (updated as R1-0528 in May 2025). A sketch of why the active-parameter figure matters follows below.
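To make the "total vs. active parameters" distinction concrete, here is a back-of-envelope calculation using the figures above. The 2-bytes-per-bf16-weight and ~2-FLOPs-per-parameter-per-token rules of thumb are rough standard estimates, not measured numbers.

```python
# Why MoE "active parameters" matter: memory scales with total parameters,
# per-token compute with active parameters (rough rules of thumb only).
total_params = 671e9   # DeepSeek-V3/R1 total parameters
active_params = 37e9   # parameters routed per token
bytes_per_param = 2    # bf16/fp16 weights

weight_memory_gb = total_params * bytes_per_param / 1e9
flops_dense = 2 * total_params    # if every weight fired on every token
flops_moe = 2 * active_params     # only the routed experts fire

print(f"Weights in memory: ~{weight_memory_gb:,.0f} GB")
print(f"Per-token FLOPs, dense equivalent: {flops_dense:.2e}")
print(f"Per-token FLOPs, MoE: {flops_moe:.2e}")
print(f"Per-token compute saving: ~{total_params / active_params:.0f}x")
```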
Technology Innovation Institute (UAE)
- Falcon 180B: Open-weight 180B-parameter model released under the TII Falcon-180B licence (Apache-2.0-based); strong multilingual and coding benchmarks, widely used in open deployments.
Alibaba Cloud
- Qwen 3-235B: Released April to July 2025, open-weight (Apache-2.0); the Qwen 3 family spans dense and MoE variants, with the 235B MoE model activating ~22B parameters per token. Strong performance on Chinese/English and reasoning tasks.
- Other Qwen models include Qwen 2.5 (dense and multimodal variants) at up to ~72B parameters.
Mistral AI (France)
- Mistral Large 2: Open-weight ~123B-parameter model with a 128K-token context window, known for coding prowess and efficiency. Released July 2024 under a research license (commercial terms available separately).
- Pixtral Large: Multimodal variant (~124B, pairing the Large 2 decoder with a ~1B image encoder), also under a research license.
Other Notable Models
- Nemotron-4 (NVIDIA): ~340B parameters, open-weight under the NVIDIA Open Model License; enterprise focus, less widely adopted than Falcon or LLaMA.
- Gemma 3 family (Google, open): Highly popular small-to-medium models with strong multilingual, multimodal performance; deployed widely via Hugging Face.
- MiniMax-Text-01 (MiniMax): ~456B-parameter open-weight model released in early 2025.
Summary Table

Company / Model | Proprietary or Open-Weight/Source | Approx. Params | Context Window | Highlights
---|---|---|---|---
OpenAI – GPT-4o | Proprietary | ~1.8T (est.) | ~128K | Multimodal general-purpose flagship
OpenAI – GPT-4.1 | Proprietary | Not disclosed | ~1M tokens | Reasoning and vision across domains
OpenAI – GPT-OSS-120B/20B | Open-weight (Apache 2.0) | 120B / 20B | ~128K | First open-weight models from OpenAI since 2019
Google – Gemini 2.5 Pro | Proprietary | Not disclosed | ~1M tokens | Multimodal, extreme context, reasoning leader
Google – Gemma 3 / 3n | Open-weight | 1-27B | ~128K | Lightweight, multilingual, on-device capable
Anthropic – Claude Opus 4 / 3.7 Sonnet | Proprietary | Not disclosed | ~200K | Safety-first, complex reasoning and chat-oriented
DeepSeek – V3 / R1 | Open-weight / open source | 671B (37B active) | ~128K | Strong benchmarks, efficient sparse MoE architecture
Alibaba – Qwen 3-235B | Open-weight (Apache 2.0) | 235B (~22B active) | ~262K | Multilingual, large context, Chinese/English
TII – Falcon 180B | Open-weight (TII Falcon-180B licence) | 180B | ~2K | Widely adopted open model, short context window
Mistral – Large 2 / Pixtral | Open-weight / research license | 123-124B | ~128K | Coding optimization, multimodal via Pixtral
NVIDIA – Nemotron-4 | Open-weight (NVIDIA Open Model License) | ~340B | ~8K | Enterprise-grade open model
MiniMax – MiniMax-Text-01 | Open-weight | 456B | ~4M tokens | Open text model with an exceptionally long context
Takeaways
- Most powerful proprietary: OpenAI GPT-4o and GPT-4.1, Gemini 2.5 Pro, Claude Opus 4.
- Most significant open source / open-weight: DeepSeek V3/R1, Qwen 3-235B, Mistral Large 2, Falcon 180B, Nemotron-4, and Gemma 3.
- Highest context windows: MiniMax-Text-01 (~4M tokens), followed by GPT-4.1 and Gemini 2.5 Pro (~1M tokens).
- Accessibility: the GPT-OSS models and the open-weight releases from DeepSeek, Qwen, and Mistral (including Pixtral) can be fully downloaded and fine-tuned; see the LoRA sketch after this list.
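As one concrete illustration of that fine-tuning capability, the sketch below attaches LoRA adapters to a downloaded open-weight checkpoint with the `peft` library. The model id `Qwen/Qwen3-0.6B` and the hyperparameters are illustrative assumptions, not a tested recipe.

```python
# Minimal LoRA fine-tuning setup for a downloaded open-weight model
# (illustrative model id and hyperparameters, not a tuned recipe).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",   # small open-weight Qwen 3 variant, for illustration
    torch_dtype="auto",
    device_map="auto",
)

lora = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # adapter scaling
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapters are trainable
```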
BEST OPEN SOURCE
Model | Company | ~Params (Active) | License | Why It Stands Out
---|---|---|---|---
DeepSeek-V3 / R1 | DeepSeek-AI | 671B (37B active) | MIT (fully open) | Top benchmark scores; ultra-efficient; widely replicable (GitHub, Financial Times)
Llama 4 | Meta AI | ~400B total (MoE, ~17B active) | Meta Llama license | Broad community use, flexible deployment (Instaclustr, Wikipedia)
Falcon 180B | TII (UAE) | 180B (dense) | TII Falcon-180B licence | Strong multilingual capabilities, open access (Bestarion, Reddit)
Qwen 3-235B | Alibaba Cloud | 235B (MoE, ~22B active) | Apache-2.0 | Competitive in multi-language tasks; commercial-grade quality (Exploding Topics, Bestarion)
DBRX | Databricks / Mosaic ML | 132B (MoE, ~36B active) | Databricks Open Model License | Excellent reasoning/math performance benchmarks (Wikipedia, Exploding Topics)
Nemotron-4 | NVIDIA | ~340B | NVIDIA Open Model License | Open release by a major tech company; still niche adoption (Wikipedia, Bestarion)
BIGGEST
Rank | Model | Params (approx.) | License
---|---|---|---
1 | GPT-4o | ~1.8 T (est.) | Proprietary (API)
2 | PanGu-Σ | ~1.085 T | Proprietary
3 | DeepSeek-V3 / R1 | 671 B (37 B active) | Open source (MIT)
4 | Llama 4 | up to ~400 B total (MoE) | Open (Meta Llama license)
5 | GLM-4.5 | ~355 B (32 B active) | Open-weight (MIT)
6 | Grok 3 | Not disclosed (~314 B confirmed only for Grok-1) | Proprietary
7 | Qwen 3 | 235 B | Open source (Apache 2.0)
8 | Mistral Large 2 | ~123 B | Open / research license
9 | gpt-oss-120B | 117 B | Open (Apache 2.0)
… | Others | Various larger / smaller | Mixed