MIT Opens MathNet — World's Largest Collection of Math Olympiad Problems
AI-processed from MIT News; edited by Hamidun News
MIT has released MathNet, the largest open collection of mathematical Olympiad problems to date, aimed at both AI researchers and students preparing for competitions. The database includes over 30,000 problems with detailed solutions from national mathematics Olympiads in 47 countries. For the AI industry, it offers a more rigorous test of mathematical reasoning than conventional English-language benchmarks. For students, it is a unified library of high-quality problems that were previously scattered across paper collections, forums, and personal archives.
The project was created by researchers from MIT CSAIL, KAUST, and the company HUMAIN. According to the team, MathNet covers 17 languages, 143 competitions, and approximately four decades of Olympiad mathematics.
The authors had to collect 1,595 PDF volumes totaling over 25,000 pages, ranging from modern digital documents to old scans that had existed only in personal collections for years. A substantial part of the archive came from the private collection of one of the co-authors, who has been manually scanning Olympiad compilations since 2006. According to MIT, the resulting dataset is approximately five times larger than the nearest comparable dataset. It is already publicly available and will be presented at the ICLR 2026 conference in Brazil.
MathNet differs from earlier datasets not only in scale but also in the quality of its sources. While many existing mathematical datasets were collected from forums like Art of Problem Solving, here problems are taken only from official national compilations. This matters because solutions in such materials are usually written by experts, undergo verification, and often explore several different solution methods for a single problem.
The collection is also much broader geographically: it covers six continents, includes both text and visual problems, and is not limited to English- and Chinese-language traditions. For additional validation, the team assembled a group of more than 30 reviewers from different countries who jointly rechecked thousands of solutions. For researchers, this is an opportunity to train models on a more diverse mathematical culture rather than on a narrow set of familiar formulations.
As a benchmark for AI, MathNet produces rather uncomfortable results even for strong models. On the main set of 6,400 problems, GPT-5 scored approximately 69.3 percent, meaning it failed on nearly one in three Olympiad-level problems.
When the problem contains illustrations, model results drop even more noticeably, pointing to persistent weakness in visual reasoning. The team also tested how models work with less common languages: several open-source systems scored 0 percent on problems in Mongolian. Separately, the researchers added a retrieval benchmark, where one needs to recognize structural similarity between two problems.
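A retrieval benchmark of this kind is commonly scored with top-1 accuracy (Recall@1): a query counts as a hit only if the single nearest problem in embedding space is the known structural match. The sketch below illustrates that metric on toy vectors. It is an assumption about how such a score could be computed, not the MathNet team's evaluation code, and the names `cosine`, `recall_at_1`, and the sample vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recall_at_1(query_vecs, corpus_vecs, gold_index):
    """Fraction of queries whose most similar corpus item is the gold match."""
    hits = 0
    for query, gold in zip(query_vecs, gold_index):
        best = max(range(len(corpus_vecs)),
                   key=lambda i: cosine(query, corpus_vecs[i]))
        if best == gold:
            hits += 1
    return hits / len(query_vecs)


# Toy embeddings (hypothetical): three corpus problems, three query problems.
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9], [0.6, 0.7, 0.0]]
gold = [0, 2, 0]  # index of the true structural match for each query

score = recall_at_1(queries, corpus, gold)
print(f"Recall@1 = {score:.2f}")
```

In this toy case the third query sits slightly closer to the wrong problem, so it counts as a miss even though the true match is a near second. That strictness is one reason first-try scores can be low.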
Even the best embedding models found the correct match on the first try in only about 5 percent of cases. This matters not only for AI but for Olympiads themselves: essentially similar problems have already appeared in actual exams, and tracking mathematical equivalents across different languages, notations, and formats is extremely difficult even for experts. Another test showed that retrieval-augmented generation does help, but only if the suggested problem is truly close in structure: for DeepSeek-V3.2-Speciale the improvement reached up to 12 percentage points, while irrelevant hints worsened the result in approximately 22 percent of cases.
The practical significance of MathNet extends beyond academic AI. For schoolchildren and teachers, this is a rare case where high-quality Olympiad materials from dozens of countries are collected in one place and brought to a unified format.
For model developers, this is a reminder that bold claims about "nearly solved" mathematics are premature: as soon as problems become truly international, multimodal, and less standardized, the quality gap is clearly visible. This is why MathNet could become one of the most useful tests of models' real mathematical reasoning in the coming years and, at the same time, one of the most valuable open libraries for Olympiad preparation.