Scaling Laws on Autopilot: AI Has Started Teaching People How to Build Neural Networks
Researchers from Peking University and Stanford University have automated the most expensive part of neural network development: the search for scaling laws (Scaling La…
AI-processed from Jiqizhixin (机器之心); edited by Hamidun News
Imagine you're building a skyscraper but don't know for sure whether the foundation can support ten more floors. Large language models work in much the same way. Engineers spend hundreds of millions of dollars on training, hoping that adding another few thousand graphics cards will make the model smarter, not just more expensive. The empirical rules that relate model size, data, and compute to final quality are called scaling laws, and until recently, finding them resembled modern-day alchemy. But the era of reading tea leaves seems to be coming to an end, because researchers from Peking University and Stanford have decided to hand this tedious and expensive work over to the neural network itself.
The problem is that finding these laws is a grueling and prohibitively expensive process. Recall DeepMind's well-known Chinchilla work (the Chinchilla scaling laws). Back then, researchers had to train hundreds of smaller models, collect data on their performance, and try to derive a formula that would predict the behavior of the "big brother." An error in the calculations at this stage costs not just time but a fortune.
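The "train small models, then extrapolate" workflow described above can be sketched in a few lines. This is a minimal illustration, not the method from the Chinchilla work or from the new project: it assumes a pure power law, loss ≈ a · size^(-b), with no irreducible-loss term, and the data points are synthetic numbers generated inside the script, not real measurements. The function names are hypothetical.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-b) by ordinary least squares in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(size), so a = exp(intercept) and b = -slope
    return math.exp(intercept), -slope


def predict_loss(a, b, size):
    """Extrapolate the fitted law to a model size that was never trained."""
    return a * size ** (-b)


# Synthetic "small model" results, generated from a known law (a=10, b=0.1)
# purely for illustration. Real runs would be noisy.
small_sizes = [1e6, 3e6, 1e7, 3e7, 1e8]
small_losses = [10.0 * n ** -0.1 for n in small_sizes]

a, b = fit_power_law(small_sizes, small_losses)
big_prediction = predict_loss(a, b, 1e10)  # the "big brother" model
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e10 params: {big_prediction:.3f}")
```

With real, noisy measurements the fitted exponent shifts with every data point, and a small error in it compounds over the orders of magnitude being extrapolated, which is why a mistake at this stage is so costly.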
The new project, given the working title "AI Scientist," fundamentally changes the rules of the game. Instead of forcing people to manually select coefficients and build graphs, scientists created a system that analyzes trial run results and formulates mathematical dependencies on its own. What's most ironic here is that this virtual scientist performed the task better than living experts. During tests, the system predicted model accuracy with a margin of error that turned out to be significantly lower than that of experienced data scientists.
This isn't simply a matter of speed or convenience. We're used to thinking that scientific discovery and intuition are humanity's last bastions, but it turned out that in finding hidden patterns within massive datasets, our brain is too prone to oversimplification. AI doesn't search for "beautiful" numbers or simple linear graphs; it finds the dependencies that actually work in the multidimensional space of parameters.
Why is this important right now? We've reached a point where simply adding computational power no longer yields explosive growth in quality. The industry increasingly whispers about a "plateau," and to move forward, we need not just teraflops but surgical precision in architecture. If OpenAI or Google could previously afford to burn an entire city's worth of electricity for the sake of an experiment, investors now demand efficiency.
Automating the search for scaling laws essentially gives a navigation system to those who previously had to feel their way through thick fog. Now we can know in advance whether it's worth feeding the model another trillion tokens or whether it has already reached its limit.
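To make the "another trillion tokens" question concrete, here is a small sketch that uses the parametric form popularized by the Chinchilla work, L(N, D) = E + A / N^alpha + B / D^beta, where N is the parameter count and D the number of training tokens. The coefficients below are placeholders chosen for illustration, not fitted values from any paper, and the function names are hypothetical.

```python
def predicted_loss(n_params, n_tokens, E, A, alpha, B, beta):
    """Chinchilla-style form: L(N, D) = E + A / N**alpha + B / D**beta."""
    return E + A / n_params ** alpha + B / n_tokens ** beta


def gain_from_extra_tokens(n_params, n_tokens, extra_tokens, **coeffs):
    """Predicted drop in loss from training on extra_tokens more tokens."""
    before = predicted_loss(n_params, n_tokens, **coeffs)
    after = predicted_loss(n_params, n_tokens + extra_tokens, **coeffs)
    return before - after


# Placeholder coefficients (assumptions for this sketch, not published fits).
COEFFS = dict(E=1.7, A=400.0, alpha=0.34, B=410.0, beta=0.28)

N = 7e9          # hypothetical 7B-parameter model
TRILLION = 1e12  # one trillion tokens

gain_early = gain_from_extra_tokens(N, 1e12, TRILLION, **COEFFS)   # 1T -> 2T
gain_late = gain_from_extra_tokens(N, 10e12, TRILLION, **COEFFS)   # 10T -> 11T

print(f"gain from 1T -> 2T tokens:   {gain_early:.4f}")
print(f"gain from 10T -> 11T tokens: {gain_late:.4f}")
```

Under this form the same extra trillion tokens buys less and less, and the loss never drops below the constant E. Once a law has been fitted, that shrinking gain can be weighed against the cost of the additional training run before any hardware is reserved.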
What does this mean for the future of the industry? We'll likely see a sharp acceleration of development cycles. Where it once took months to verify a fundamental hypothesis, an automated system can now run thousands of scenarios in just hours. This brings us closer to the moment when neural networks begin designing the next generations of themselves with virtually no human involvement. We still have our hand on the circuit breaker, but someone else is drawing the blueprints. And that "someone" clearly understands the mathematics of learning better than we do.
Bottom line: AI has finally stopped being just a "smart chatbot" and has become a tool for fundamental scientific discovery. If neural networks have learned to optimize their own training better than their creators, then the emergence of full-fledged AGI becomes merely a question of time and the right formula, one that will probably not be found by a human.