The Transformer Isn't Needed Anymore: Former OpenAI VP Builds a New Billion-Dollar Empire
Another high-profile departure from OpenAI has turned into an ambitious startup. The former Vice President of Research is convinced that to reach AGI, you need to throw out…
AI-processed from Jiqizhixin (机器之心); edited by Hamidun News
It seems the hallways of OpenAI are getting emptier. The exodus of key employees from Sam Altman's company has gone from an unfortunate brain drain to the formation of a whole new industry. This time the headlines belong to a former Vice President of Research, who did not simply leave "into the void" but announced a startup with billion-dollar ambitions.
And his main goal sounds almost sacrilegious to the modern AI community: he plans to challenge the Transformer architecture, the foundation of everything we call modern artificial intelligence. Let's be honest: the Transformer, introduced by Google researchers in 2017, has become the gold standard. All these GPTs, Claudes, and Geminis are essentially variations on the same idea.
But this architecture has fundamental problems with memory scaling and computational efficiency: in standard self-attention, compute and memory grow roughly quadratically with context length, so the longer the context, the harder neural networks "breathe." Former OpenAI leaders, who were there from the start of training the most powerful models, understand perfectly well that endlessly adding GPUs is a dead end.
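To make the scaling problem concrete, here is a rough back-of-the-envelope sketch. It assumes a naive implementation that stores the full attention score matrix for one layer at batch size 1. The head count (32) and 2-byte elements are illustrative assumptions, not figures from any specific model, and the function name is hypothetical.

```python
def attention_score_bytes(seq_len: int, num_heads: int = 32,
                          bytes_per_element: int = 2) -> int:
    """Bytes needed to hold one layer's full attention score matrix.

    Assumes a naive implementation that materializes a
    (seq_len x seq_len) score matrix per head, at batch size 1.
    num_heads and bytes_per_element are illustrative defaults.
    """
    return num_heads * seq_len * seq_len * bytes_per_element


# Print how the memory footprint grows as the context gets longer.
for seq_len in (1_024, 8_192, 65_536):
    gib = attention_score_bytes(seq_len) / 2**30
    print(f"{seq_len:>6} tokens -> {gib:.2f} GiB per layer")
```

Under these assumptions, doubling the context quadruples the memory for the score matrix. Optimized attention kernels can avoid storing the whole matrix at once, but the number of pairwise token comparisons still grows quadratically.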
To achieve true artificial general intelligence, you need something more elegant and efficient than the crude "attention" that the current tech stack relies on. The $1 billion the new project plans to raise is not just a nice number for headlines. It is the price of entry into the big leagues.
In a world where training one model costs hundreds of millions, attempting to create an alternative architecture requires enormous resources for experiments with hardware and data. We've already seen attempts to deploy State Space Models (SSMs), including architectures like Mamba, but none of them has managed to dethrone the king. The fact that the man responsible for "post-training" at OpenAI is taking on this challenge suggests he has a concrete understanding of exactly where the old architecture begins to break down.
This departure is part of a broader trend. We're observing how the "OpenAI mafia" spreads across Silicon Valley, creating competition that Microsoft and Google could only dream of. Anthropic was the first signal, SSI (Safe Superintelligence) from Ilya Sutskever the second.
Now we see the third wave: those who want to change not just the training methods or the approach to safety, but the very mathematical foundation of neural networks. If this startup can prove that its approach works better in the long run, OpenAI will find itself in the position of a company that invested billions in perfecting the steam engine just as the internal combustion engine appeared. Investors seem ready to take the risk.
In the Valley right now there's a strange mix of euphoria and fear of missing "the next big thing." Everyone understands that the current success of LLMs could be a local maximum. And while Sam Altman is busy turning OpenAI into a commercial corporation and searching for trillions for chips, his former engineers are trying to reinvent the wheel itself.
This is a classic David and Goliath story, except David has a billion dollars of venture capital in his pocket and the best industry experience. What does this mean for us? Most likely, we're on the verge of a paradigm shift.
If the new architecture turns out to be more efficient, AI will become not only smarter, but also cheaper, more accessible, and possibly more autonomous. It's time to get used to the idea that the acronym GPT could become as much of an anachronism as Netscape or AltaVista. In the world of AI, six months is an era, and a year is an eternity.
And this eternity seems to belong to those who dared to press the "delete" button on the Transformer code. The key question: will the new architecture be able to scale as predictably as the Transformer, or will we see another "bubble" of ambitions that bursts against the harsh reality of distributed computing?