A new analysis from Epoch AI, a nonprofit research institute, suggests that the rapid gains seen in reasoning AI models—such as OpenAI’s o3—may soon hit scaling limits. The report forecasts that within a year, the pace of improvement in models that handle complex reasoning tasks like math and programming could begin to slow significantly.
These models, which outperform conventional systems on logic-heavy benchmarks, rely heavily on a two-phase training pipeline: large-scale supervised learning followed by reinforcement learning (RL), which helps the model refine solutions to harder problems.
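To make that pipeline concrete, here is a minimal toy sketch of the two-phase pattern: a supervised pass that fits labeled demonstrations, followed by a REINFORCE-style RL pass that rewards correct answers. The task, reward, and hyperparameters below are all invented for illustration and bear no relation to any lab's actual training code.

```python
# Illustrative toy only: a two-phase "reasoning model" pipeline in miniature.
# The task (predict (2*x) % 5), the reward, and every hyperparameter are
# invented for this sketch; no lab's actual training code looks like this.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 5, 5
logits = np.zeros((N_STATES, N_ACTIONS))  # tabular stand-in for a policy


def target(x):
    """Toy ground-truth answer for 'problem' x."""
    return (2 * x) % N_ACTIONS


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


# Phase 1: supervised learning on labeled demonstrations (cross-entropy).
LR_SL = 0.5
for _ in range(200):
    x = rng.integers(N_STATES)
    p = softmax(logits[x])
    grad = p.copy()
    grad[target(x)] -= 1.0           # d(cross-entropy)/d(logits) = p - onehot
    logits[x] -= LR_SL * grad

# Phase 2: reinforcement learning (REINFORCE) with a correctness reward.
LR_RL = 0.5
for _ in range(500):
    x = rng.integers(N_STATES)
    p = softmax(logits[x])
    a = rng.choice(N_ACTIONS, p=p)   # sample an answer from the policy
    reward = 1.0 if a == target(x) else 0.0
    grad = -p.copy()
    grad[a] += 1.0                   # d(log p[a])/d(logits) = onehot - p
    logits[x] += LR_RL * reward * grad

accuracy = np.mean([softmax(logits[x]).argmax() == target(x)
                    for x in range(N_STATES)])
print(f"post-RL greedy accuracy: {accuracy:.0%}")
```

The point of the sketch is the division of labor: the supervised phase imitates known answers, while the RL phase needs only a reward signal, which is why labs can keep pouring compute into it after the labeled data runs out.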
While most AI labs have only recently begun scaling up the compute used in the RL phase, OpenAI has taken a major leap: it reportedly used 10× more compute to train o3 than its predecessor o1, with much of that increase devoted to reinforcement learning. According to OpenAI researcher Dan Roberts, future iterations could allocate more compute to RL than to the initial pre-training phase itself.
However, Epoch warns of a ceiling: there's only so much performance that additional compute can unlock. The institute notes that while gains from standard model training are currently quadrupling every year, gains from reinforcement learning are growing 10× every 3–5 months, a pace unlikely to continue for long.
Josh You, the Epoch analyst behind the report, predicts this explosive growth will taper off by 2026, when performance from RL will likely “converge with the overall frontier.”
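The arithmetic behind that forecast is straightforward to reproduce. Annualized, tenfold growth every 3–5 months works out to roughly a 250× to 10,000× yearly factor, far above the roughly 4× rate of standard training, so the RL sub-trend must catch the overall frontier quickly. The sketch below uses the two rates as reported above; the 100× starting gap is an assumed number, chosen purely for illustration.

```python
# Back-of-the-envelope version of Epoch's convergence argument. The 4x/year
# and 10x-per-3-5-months rates are the ones described above; the 100x
# starting gap is an assumption made up for this illustration.
import math

FRONTIER_RATE = 4.0   # standard training: gains roughly quadruple yearly
ASSUMED_GAP = 100.0   # assumed head start of the overall frontier over RL

for months_per_10x in (3, 5):
    rl_rate = 10.0 ** (12 / months_per_10x)  # annualized RL growth factor
    years = math.log(ASSUMED_GAP) / (math.log(rl_rate) - math.log(FRONTIER_RATE))
    print(f"10x per {months_per_10x} months ~ {rl_rate:,.0f}x per year; "
          f"catches a {ASSUMED_GAP:.0f}x gap in ~{years:.1f} years")
```

Under those assumptions the faster curve closes the gap in roughly 0.6 to 1.1 years, after which its gains can grow no faster than the frontier itself, which is broadly consistent with You's 2026 estimate.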
Beyond compute, overhead costs in time, talent, and experimentation pose additional challenges to scaling reasoning models. If these costs persist, they could limit how far labs can push the approach.
The stakes are high: major AI companies have poured billions into developing reasoning-capable models. But alongside their potential, they carry flaws—such as slower inference speeds and a higher tendency to hallucinate compared to traditional LLMs.
The future of reasoning AI may depend not just on compute, but on how efficiently reinforcement learning can scale—and whether new techniques can overcome the looming plateau. If Epoch’s projections are correct, the AI race may be headed toward a bottleneck that demands new strategies beyond brute-force training.