✅ Claude 3.7 Wins – Anthropic’s latest AI outperformed rivals at maneuvering Mario through levels.
✅ GPT-4o & Gemini Struggle – OpenAI’s and Google’s models had difficulty adapting to real-time gameplay.
✅ Slow Decision-Making Hurts – Reasoning AIs, designed to “think” through problems, performed worse because the game keeps moving while they deliberate (see the sketch after this list).
🎮 New AI Testing Ground – Video games provide a controlled but complex environment to evaluate AI decision-making.
⏳ Real-Time Challenges – Models built for deep reasoning struggle in dynamic, fast-paced settings.
🧠 Beyond Gaming? – Researchers debate whether gaming skills truly reflect AI’s broader problem-solving abilities.
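To see why inference latency is so punishing here, consider that a NES-era game advances roughly 60 frames per second whether or not the agent has responded. The minimal Python sketch below makes that concrete with a toy setup; `run_realtime_episode`, `ToyEnv`, and `SlowModel` are hypothetical illustrations invented for this example, not the benchmark’s actual harness.

```python
import time

FRAME_SECONDS = 1 / 60  # NES-style games advance ~60 frames per second

def run_realtime_episode(env, model, max_frames=600):
    """The game keeps advancing while the model deliberates, so a slow
    'reasoning' model acts on an increasingly stale view of the world."""
    obs = env.reset()
    frame = 0
    while frame < max_frames:
        start = time.monotonic()
        action = model.choose_action(obs)   # may take seconds to "think"
        thinking = time.monotonic() - start

        # Every frame spent thinking is a frame with no new input:
        # the game runs on, and the chosen action grows stale.
        for _ in range(int(thinking / FRAME_SECONDS)):
            obs, done = env.step(None)      # no-op input while thinking
            frame += 1
            if done:
                return frame
        obs, done = env.step(action)        # action finally lands, late
        frame += 1
        if done:
            return frame
    return frame

# Toy stand-ins (hypothetical): a hazard appears every half second,
# and the episode ends after 120 frames of survival.
class ToyEnv:
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        hazard = self.t % 30 == 0
        survived = (action == "jump") if hazard else True
        return self.t, (hazard and not survived) or self.t >= 120

class SlowModel:
    def __init__(self, delay):
        self.delay = delay
    def choose_action(self, obs):
        time.sleep(self.delay)              # simulate inference latency
        return "jump"                       # the policy itself is perfect

for delay in (0.01, 1.0):
    frames = run_realtime_episode(ToyEnv(), SlowModel(delay))
    print(f"latency {delay:>4}s -> survived {frames} frames")
```

In this toy run, the low-latency agent clears every hazard, while the one-second “thinker” dies at the first obstacle it never got to react to, even though both would choose the right action. That is the dynamic the Mario benchmark surfaces: a correct answer delivered too late is indistinguishable from no answer at all.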
As AI continues to evolve, benchmarks like Super Mario Bros. highlight critical gaps in real-world adaptability. Could this reshape how we measure AI intelligence?