OpenAI has announced that one of its experimental reasoning models achieved gold medal-level performance at the 2025 International Mathematical Olympiad (IMO). The feat marks a milestone in AI’s ability to perform sustained, high-level mathematical reasoning. Although the model will not be released publicly for now, it signals rapid advances in general-purpose AI research. Meanwhile, OpenAI has confirmed that GPT-5 will be released soon, and training for GPT-6 is reportedly already underway.
OpenAI’s Experimental Model Solves 5 of 6 IMO Problems
OpenAI researcher Alexander Wei revealed on X that their latest experimental large language model successfully tackled the IMO under real competition conditions. According to Wei, the AI model solved five out of six problems from the Olympiad, earning 35 out of 42 points—an achievement comparable to the top human contestants globally. The model was evaluated during two 4.5-hour sessions, adhering to the same constraints applied to human participants, including no internet access and no external tools.
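The score arithmetic follows from the IMO format, in which each of the six problems is marked out of 7 points. The short sketch below reproduces the 35-out-of-42 total. The per-problem breakdown (full marks on five problems, none on the sixth) is an assumption for illustration only, since Wei's post as reported here gives only the total.

```python
POINTS_PER_PROBLEM = 7  # each IMO problem is marked out of 7
NUM_PROBLEMS = 6        # three problems per session, two sessions

MAX_SCORE = POINTS_PER_PROBLEM * NUM_PROBLEMS


def total_score(marks):
    """Sum per-problem marks after checking they fit the IMO format."""
    if len(marks) != NUM_PROBLEMS:
        raise ValueError("expected one mark per problem")
    if any(not 0 <= m <= POINTS_PER_PROBLEM for m in marks):
        raise ValueError("each mark must be between 0 and 7")
    return sum(marks)


# Assumed breakdown: five problems fully solved, one with no credit.
assumed_marks = [7, 7, 7, 7, 7, 0]
score = total_score(assumed_marks)
print(f"{score} / {MAX_SCORE}")
```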
We achieved gold medal-level performance 🥇on the 2025 International Mathematical Olympiad with a general-purpose reasoning LLM!
— OpenAI (@OpenAI) July 19, 2025
Our model solved world-class math problems—at the level of top human contestants. A major milestone for AI and mathematics. https://t.co/u2RlFFavyT
Each of the model’s detailed proofs was assessed independently by three former IMO medalists. Final scores were awarded only after all three graders reached unanimous agreement. The results, Wei noted, represent a major breakthrough for AI systems in “sustained creative thinking,” a capability previously out of reach for language models.

Wei put the achievement in context by comparing it with earlier milestones in mathematical reasoning. He traced the progression from the elementary problems in GSM8K (which take top humans under a minute) to the more challenging MATH benchmark, then to AIME problems, and now to IMO-level tasks, which demand over 100 minutes of focused thought.
He described this progression as a reflection of major improvements in reinforcement learning and test-time compute scaling.

Despite the breakthrough, OpenAI confirmed that the model responsible for the IMO performance is not being released in the near term. Wei clarified, “The IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months.”
GPT-5 Launch Imminent, GPT-6 Training Underway
While the IMO-capable system remains in the lab, OpenAI is preparing to roll out GPT-5. According to Wei, GPT-5 is on track for release and is being developed separately from the experimental reasoning model. He stated, “We are releasing GPT-5 soon, and we’re excited for you to try it.”
Further indications came from Yuchen Jin, co-founder of Hyperbolic Labs, who suggested on X that the GPT-5 launch may be imminent. According to Jin, GPT-5 will differ from its predecessors by operating as a system of specialized models managed dynamically by a central router. This architecture would allow the system to automatically direct each user prompt to a sub-model optimized for tasks such as reasoning, general language use, or tool interaction.
Jin further noted that this multi-model framework likely explains recent remarks from OpenAI CEO Sam Altman about the need to “fix model naming.” Under the new architecture, users would no longer have to select between different model versions, as the routing system would handle it in the background based on the prompt’s nature.
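To make the routing idea concrete, here is a minimal sketch of a central router that inspects a prompt and forwards it to one of several sub-models. Everything in it is hypothetical: the `Router` class, the keyword-based `classify` rule, and the three stand-in sub-models are assumptions made for illustration, not a description of how OpenAI's system works. A production router would presumably use a learned classifier rather than keyword matching.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict

# A sub-model is modelled here as a plain function from prompt to reply.
SubModel = Callable[[str], str]

# Hypothetical keyword lists; a real router would likely learn this mapping.
REASONING_WORDS = {"prove", "solve", "derive"}
TOOL_WORDS = {"search", "fetch", "execute"}


@dataclass
class Router:
    models: Dict[str, SubModel]
    default: str = "general"

    def classify(self, prompt: str) -> str:
        """Pick a sub-model name from the words in the prompt."""
        words = set(re.findall(r"[a-z]+", prompt.lower()))
        if words & REASONING_WORDS:
            return "reasoning"
        if words & TOOL_WORDS:
            return "tools"
        return self.default

    def route(self, prompt: str) -> str:
        """Forward the prompt to the chosen sub-model and return its reply."""
        return self.models[self.classify(prompt)](prompt)


# Stand-in sub-models that only label which path handled the prompt.
router = Router(models={
    "reasoning": lambda p: f"[reasoning] {p}",
    "tools": lambda p: f"[tools] {p}",
    "general": lambda p: f"[general] {p}",
})

print(router.route("Prove that the square root of 2 is irrational"))
print(router.route("Write a short poem about autumn"))
```

The user calls a single entry point, `route`, and never names a model, which is the behavior Jin describes.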
In a related post, Jin also said that GPT-6 is already in training. He suggested that safety evaluations could affect the timing of its release, remarking, “I just hope they’re not delaying it for more safety tests.”
A Milestone in AI Reasoning and Long-Term Progress
The IMO result highlights how quickly AI development is accelerating, especially in demanding fields such as mathematics. Wei reflected on a forecast he made in 2021 with his PhD advisor, Jacob Steinhardt, and on how far the actual outcome had diverged from it. At the time, Wei predicted that AI would reach 30 percent on the MATH benchmark by July 2025. The actual outcome, gold medal-level performance at the IMO, far exceeds that.

He credited other researchers who contributed to the model’s development, including Sheryl Hsu and Noam Brown. Wei emphasized that the achievement required considerable advances in reinforcement learning methods and computational optimization, an area the wider AI community has been following closely.
A year earlier, Google DeepMind’s AlphaProof and AlphaGeometry 2 had shown similar potential by solving four of the six IMO problems, a score equivalent to a silver medal. That OpenAI’s model exceeded this performance suggests that the gap between AI and top human performers in abstract reasoning is narrowing.

The IMO model will not be part of the GPT-5 release, but OpenAI appears intent on pushing the boundaries of reasoning capability further.
The company’s decision not to release the model immediately reflects a cautious approach to deploying systems with strong reasoning capabilities before they have been thoroughly tested. As the world awaits GPT-5 and watches GPT-6 being trained, OpenAI’s latest result points to a new phase of AI progress, marked by machines that are not only linguistically fluent but also mathematically insightful, even under strictly constrained conditions.