xAI Unleashes Grok 4 and Grok 4 Heavy — “Smarter Than a PhD,” Now with Voice, Vision, and SOTA Across the Board

July 10, 2025 15:09 · 3 min read

xAI just launched Grok 4 and Grok 4 Heavy, reasoning-first AI models claiming to outthink PhDs and beat OpenAI’s o3 and Gemini 2.5 on top benchmarks. Powered by Colossus, priced up to $300/month. But after Grok 3's backlash, trust is the real test.


Elon Musk’s xAI just dropped Grok 4 and Grok 4 Heavy, its next-gen AI models designed for pure reasoning power, and they’re already boasting state-of-the-art (SOTA) performance on elite benchmarks like Arc-AGI-2 and Humanity’s Last Exam.

What’s new?

  • Grok 4 is a single-agent model equipped with voice, vision, and a massive 128K-token context window.

  • Grok 4 Heavy takes it a step further, using multi-agent collaboration to handle more complex reasoning and tasks.

The benchmarks tell the story

Both models deliver SOTA results on:

  • Humanity’s Last Exam

  • Arc-AGI-2

  • AIME (math-focused benchmark)

According to xAI, they also outperform Gemini 2.5 Pro and OpenAI’s o3 on these benchmarks, putting xAI in direct competition with the biggest players in the space.

Pricing and access

  • Grok 4 is part of the SuperGrok plan at $30/month.

  • Grok 4 Heavy is bundled with the SuperGrok Heavy plan, priced at $300/month.

  • For developers, the API includes a 256K-token context window and built-in search, priced at $3/million input tokens and $15/million output tokens.
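To make the API rates above concrete, here is a minimal Python sketch that estimates what a single request would cost at the quoted $3 per million input tokens and $15 per million output tokens. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any xAI SDK, and real bills may differ if xAI applies other charges.

```python
# Rates quoted in the article, in USD per million tokens.
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates.

    Hypothetical helper for illustration; not an xAI API call.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed example: a long 200,000-token prompt with a 2,000-token reply.
    cost = estimate_cost_usd(200_000, 2_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, long generated answers can dominate the bill even when the prompt is much larger than the reply.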

The context: Redemption after Grok 3?

This launch follows a wave of criticism after Grok 3 made headlines for producing racist and antisemitic outputs following an update. xAI claims this release is built on tighter alignment, but it’s clear the company will face intense scrutiny going forward.

Why it matters

xAI might be new, but it’s not playing small. Grok 4 and 4 Heavy showcase the raw power of Musk’s Colossus supercomputer and raise the bar for scaling frontier LLMs. But with controversy still fresh, the bigger challenge might not be benchmarks—it’s trust.

