TL;DR: xAI’s Grok 4 Fast is a new 2M‑context multimodal model that lands #1 on LMArena’s Search Arena and top‑10 on the Text Arena, while launching at $0.20 / 1M input and $0.50 / 1M output tokens. It’s free for a limited time on OpenRouter and Vercel AI Gateway, and early signals point to reinforcement‑learning (RL) infrastructure, plus a lot of compute, behind the jump. (xAI; LMArena; Vercel)
Links referenced in this post:
Lech Mazur on Grok’s Connections benchmark results (NYT Connections “Extended”) → X (formerly Twitter)
John Boccio (xAI RL Infrastructure) on the new agent framework used in Grok 4 Fast’s training run → X (formerly Twitter)
xAI introduced Grok 4 Fast, a unified model exposed as two API SKUs:
grok-4-fast-reasoning and grok-4-fast-non-reasoning, both with a 2,000,000‑token context window. The “reasoning” vs. “non‑reasoning” behavior is steered by prompts but uses the same weights, so you don’t juggle separate models. (xAI)
Pricing (xAI API): $0.20 / 1M input, $0.50 / 1M output (cached input: $0.05 / 1M). Higher rates apply only to requests exceeding 128K context. Live search is billed at $25 / 1K sources. (xAI Docs)
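Those rates make per‑request budgeting a one‑liner. A minimal sketch with the launch rates above hard‑coded (the >128K‑context tier and live‑search fees are not modeled):

```python
def request_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated cost of one Grok 4 Fast call at standard (<=128K-context) launch rates."""
    INPUT_RATE = 0.20 / 1_000_000   # $ per fresh input token
    CACHED_RATE = 0.05 / 1_000_000  # $ per cached input token
    OUTPUT_RATE = 0.50 / 1_000_000  # $ per output token
    fresh = input_tokens - cached_tokens
    return fresh * INPUT_RATE + cached_tokens * CACHED_RATE + output_tokens * OUTPUT_RATE

# A 100K-token prompt (half of it cache-hit) producing a 5K-token answer:
print(f"${request_cost_usd(100_000, 5_000, cached_tokens=50_000):.4f}")  # → $0.0150
```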
Availability: in Grok on web/iOS/Android, plus, for a limited time, free via OpenRouter and Vercel AI Gateway. (xAI; OpenRouter)
xAI’s announcement emphasizes large‑scale reinforcement learning and “tool‑use RL” (e.g., deciding when to browse or code) to maximize intelligence per token (“intelligence density”), claiming ~40% fewer thinking tokens than Grok 4 at comparable accuracy and a ~98% reduction in price to match Grok 4’s frontier results. (xAI)
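Those two claims compose. Assuming Grok 4’s list price of $15 / 1M output tokens (its rate at launch; my assumption, not stated in this announcement), 40% fewer thinking tokens at the $0.50 rate works out to roughly the claimed ~98% reduction:

```python
# Back-of-envelope check on the "~98% cheaper" claim. Assumes Grok 4's
# launch price of $15 / 1M output tokens (not stated in this post).
grok4_out = 15.00     # $ per 1M output (thinking) tokens, Grok 4
grok4fast_out = 0.50  # $ per 1M output tokens, Grok 4 Fast
token_ratio = 0.60    # ~40% fewer thinking tokens at comparable accuracy
cost_ratio = (grok4fast_out * token_ratio) / grok4_out
print(f"{1 - cost_ratio:.0%} cheaper to reach the same result")  # → 98% cheaper ...
```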
How good is it (so far)?
LMArena: #1 in Search, top‑10 in Text
Search Arena: grok-4-fast-search is #1 with a preliminary Elo of 1163, edging out o3‑search, gpt‑5‑search, and gemini‑2.5‑pro‑grounding. As always, these are blind head‑to‑head votes in which humans compare anonymous model outputs. Early, but notable. (LMArena)
Text Arena: grok-4-fast currently sits 8th in the overall Text leaderboard snapshot, impressive for a “fast”/cost‑efficient model tier. (Positions shift as new votes roll in.) (LMArena)
xAI’s post highlights search/browsing evals (BrowseComp, SimpleQA, etc.), where Grok 4 Fast claims SOTA‑level agentic search behavior; that aligns with the early LMArena Search result above. (xAI)
Benchmarks evolve and ratings can move as votes accumulate. Treat the Search #1 and Text top‑10 as very promising but provisional snapshots.
Why the jump? (Likely) RL at scale + infrastructure
Two tea leaves:
RL agent framework: John Boccio (xAI RL Infra) says a new agent framework underpinned the Grok 4 Fast training run and will power future RL training, hinting at process and scaling wins in RL post‑training. (X)
Talent & compute: Dustin Tran (8 years at Google Brain/DeepMind; RL/evals/data) announced he has joined xAI; his thread underscores a deep focus on RL/evals and, implicitly, a lot of chips. Meanwhile, Colossus (xAI’s Memphis supercomputer program) is publicly positioned as a record‑scale cluster built and scaled at unusual speed, with reporting around hundreds of thousands of GPUs. It’s reasonable to infer the RL budget is substantial. (X; xAI)
Put together: process + people + (a lot of) compute makes Grok 4 Fast’s “fast/cheap yet very strong” landing less mysterious.
Pricing & availability (developer quick facts)
xAI API model IDs: grok-4-fast-reasoning and grok-4-fast-non-reasoning (2M context). (xAI Docs)
Free access (limited time): OpenRouter lists x-ai/grok-4-fast:free; Vercel AI Gateway also carries Grok 4 Fast in its model library and playground. (OpenRouter; Vercel)
Rollout in apps: Grok on web/iOS/Android uses Grok 4 Fast in Fast/Auto modes for searchy, information‑seeking queries. (xAI)
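To poke at the free tier, OpenRouter exposes an OpenAI‑compatible chat‑completions endpoint. A stdlib‑only sketch (the helper names are mine, and an OPENROUTER_API_KEY environment variable is assumed):

```python
import json
import os
import urllib.request

def build_request(prompt: str) -> dict:
    """Chat-completions payload targeting the free-period model ID."""
    return {
        "model": "x-ai/grok-4-fast:free",
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """POST to OpenRouter's OpenAI-compatible endpoint; return the reply text."""
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```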
A visual that matters: price ↔ “intelligence” tradeoff
xAI points to an independent Artificial Analysis view showing Grok 4 Fast with a state‑of‑the‑art price‑to‑intelligence ratio (they even plot an “Intelligence vs. Price” curve). Whether or not you love that composite index, it’s another datapoint: frontier‑adjacent quality at a far lower run cost. (xAI; Artificial Analysis)
The Connections thing (and why people noticed)
Lech Mazur reports that Grok 4 Fast (Reasoning) set a new high of 92.1 on his Extended NYT Connections benchmark. That tracks with the broader narrative: RL‑hardened reasoning and agentic behaviors improving practical problem‑solving, not just static Q&A. (Benchmarks like this are community‑run; still a useful directional signal.) (X)
Why this release matters
Search is where assistants earn their keep. If Grok holds #1 in LMArena’s Search Arena as votes climb, that’s a material shift for research/productivity use cases that rely on multi‑hop browsing, citation, and source fusion. (LMArena)
The “fast tier” got upgraded. Grok 4 Fast lands near frontier models in text quality while undercutting many on price, reshaping the “cheap‑and‑quick” segment. (LMArena)
RL at scale may be the story of 2025. Boccio’s and Tran’s notes line up with a broader industry trend: post‑training RL (and agent training) becoming the dominant slab of compute, and a key differentiator. (X)
xAI API: use grok-4-fast-reasoning when you need deep chains of thought and grok-4-fast-non-reasoning for snappy responses under the same 2M‑context ceiling. Start at $0.20/$0.50 per 1M tokens, with prompt caching to cut costs. (xAI Docs)
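Because both SKUs share weights and the 2M context, switching per request is just a model‑ID swap. A trivial helper (the helper itself is hypothetical; the IDs are the documented ones):

```python
def pick_model(deep_reasoning: bool) -> str:
    """Choose a documented Grok 4 Fast SKU per request; same weights either way."""
    return "grok-4-fast-reasoning" if deep_reasoning else "grok-4-fast-non-reasoning"

# Long analysis gets chains of thought; chat turns stay snappy.
print(pick_model(True))   # → grok-4-fast-reasoning
print(pick_model(False))  # → grok-4-fast-non-reasoning
```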
One more product note: “Read Aloud”
xAI added a Read Aloud mode to Grok, announced around the Grok 4 Fast window, which lets you hear responses in a natural voice. Handy for drive time or multitasking. (Announcement coverage linked below.) (LatestLY)
The human angle
Dustin Tran (RL/evals lead work across the Gemini lines) is now at xAI; his thread reflecting on his DeepMind years and this move drew wide attention. (X)
John Boccio says a new RL agent framework powered Grok 4 Fast’s training run and will anchor future RL runs, implying sustained investment in the approach that lifted this model. (X)
What to watch next
LMArena stability: will grok-4-fast-search hold #1 as votes and confidence accumulate? Keep an eye on the Search and Text tabs. (LMArena)
API economics: at these prices, 2M‑context projects (e.g., RAG over large codebases or document sets) become newly practical; watch for developer case studies. (xAI Docs)
RL scaling curve: if xAI keeps iterating on the RL agent framework, with Colossus‑scale compute behind it, expect more “fast but frontier‑ish” releases. (X; xAI)
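For a feel of those long‑context economics, a rough estimator at the standard ≤128K launch rates (requests beyond 128K context use the higher tier, which isn’t modeled; the helper and figures are illustrative only):

```python
def corpus_qa_cost(doc_tokens: int, queries: int,
                   query_tokens: int = 2_000, answer_tokens: int = 1_000) -> float:
    """USD to answer `queries` questions with a whole corpus in-context,
    assuming the corpus prefix is cache-hit after the first request."""
    IN, CACHED, OUT = 0.20e-6, 0.05e-6, 0.50e-6   # launch $/token rates
    per_query_tail = query_tokens * IN + answer_tokens * OUT
    first = doc_tokens * IN + per_query_tail
    rest = (queries - 1) * (doc_tokens * CACHED + per_query_tail)
    return first + rest

# A 120K-token document set, 20 questions against it:
print(f"${corpus_qa_cost(120_000, 20):.2f}")  # → $0.16
```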
Sources & further reading
xAI news post: “Grok 4 Fast” (features, LMArena placement, tool‑use RL, free period on OpenRouter/Vercel, 2M context). xAI
xAI docs (pricing/specs for both SKUs): Grok 4 Fast Reasoning / Non‑Reasoning. xAI Docs
LMArena leaderboards: Search (Grok 4 Fast #1) and Text (Grok 4 Fast top‑10). Method: anonymous, pairwise votes. LMArena
OpenRouter (free‑period model page): x-ai/grok-4-fast:free. OpenRouter
Vercel AI Gateway (model library / playground): Grok 4 Fast listing. Vercel
Lech Mazur on the Connections benchmark: Grok 4 Fast (Reasoning), 92.1 → X (formerly Twitter)
John Boccio (xAI RL Infrastructure) on the new agent framework for Grok 4 Fast training → X (formerly Twitter)