AlphaGo's creator founded a unicorn company to build AI super-learners

David Silver, the scientist whose AlphaGo system became the first ever to defeat a world Go champion in 2016, has founded a new company valued at a billion dollars. The goal: create "super-learners"—AI that learns from its own experience rather than data created by humans. According to Silver, the entire industry is on the wrong path, scaling language models instead of pursuing reinforcement learning.

Khamidun Zhemal

AI monitoring · Wired

Apr 27, 2026 · 3 min

AI-processed from Wired; edited by Hamidun News

AlphaGo's creator founded a unicorn company to build AI super-learners — Source: Wired. Collage: Hamidun News.


David Silver, the scientist who created the AlphaGo algorithm that in 2016 became the first in history to defeat the world Go champion, has announced the founding of a new company valued at approximately one billion dollars. Its goal is to build what Silver calls super-learners: AI systems capable of independently mastering complex domains of knowledge without relying on datasets created by humans. This is a direct challenge to the industry's dominant paradigm, in which all major players are betting on scaling language models.

Silver is one of the key architects of modern AI, and his biography speaks for itself. His work at Google DeepMind led to AlphaGo and later to AlphaZero, an algorithm that mastered chess, shogi, and Go from scratch without ever seeing a single human game. Instead of learning from prepared examples, the system generated and analyzed millions of positions through self-play, discovering strategies that professional players described as inhuman.

It is this experience that shapes his conviction about what next-generation AI should be.

Silver's central idea is simple but radical: large language models such as ChatGPT, Claude, and Gemini are fundamentally limited because they learn exclusively from text and data produced by humans. In his view, this creates an insurmountable ceiling: AI cannot surpass the cognitive abilities of its creators if it feeds only on their knowledge and their misconceptions. Simply increasing the number of parameters and the volume of training data, he argues, does not solve this fundamental problem; it only scales it.

The alternative is reinforcement learning (RL). Unlike supervised learning, where a model learns to reproduce correct answers from a pre-labeled dataset, RL allows an agent to independently explore the space of possibilities: try actions, receive reward signals, and gradually build a strategy. This is precisely how AlphaGo worked—and this approach, Silver is convinced, opens the path to AI that surpasses humans across a broad spectrum of tasks, not just in pre-agreed games.
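The loop described above (try an action, receive a reward, update the strategy) can be illustrated with a minimal sketch. This is not AlphaGo's method, which combined deep neural networks with tree search and self-play; it is plain tabular Q-learning on a hypothetical six-cell corridor, chosen only because it fits in a few lines. The environment, the function names, and all parameter values are assumptions made for illustration.

```python
import random


def train_corridor_agent(n_states=6, episodes=500, alpha=0.5,
                         gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    The agent starts in cell 0 and earns a reward of 1.0 only when it
    steps into the last cell. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action] holds the agent's current estimate of future reward.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon, or when both estimates tie;
            # otherwise exploit the action that currently looks better.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Environment step: the left wall keeps the agent in cell 0.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate toward
            # reward + discounted best estimate of the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-looking action for every non-terminal cell."""
    return ["right" if right > left else "left" for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

No labeled examples are supplied anywhere: the agent is told only when it reaches the goal and works out for itself that moving right is the better strategy. In this toy setting the reward is trivial to define, which is exactly what becomes hard in open-ended tasks.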

This position has serious arguments in its favor. OpenAI is partially moving in this direction with its o-series reasoning models, which use elements of RL to check their own answers. Google DeepMind continues fundamental research in this area.

Nevertheless, the bulk of industry resources remain concentrated on scaling language models, and it is precisely against this mainstream that Silver takes an openly contrarian stance.

The main difficulty with RL lies beyond narrow, clearly defined tasks. For chess, it is simple to set the reward function: win and you get a plus. For writing convincing text, making a well-considered business decision, or conducting original scientific research, the reward function is not obvious. It is precisely this problem of ineffable intelligence that the new company must solve.

The one-billion-dollar valuation without a single product on the market speaks to the weight of the founder's reputation.

In the current investment climate, when every AI startup claims historical significance, the name of AlphaGo's creator is simultaneously a ready-made proof of concept and insurance for investors unwilling to wait years.

If Silver is right, the next phase of the AI race will look fundamentally different: less human data, more autonomous self-learning, less imitation—more discovery. Systems capable of independently forming knowledge beyond what humanity knows—that is his vision of super-learners. Whether the idea will materialize into an actual product, time will tell. But the fact that one of the chief architects of modern AI is making a public bet against the dominant paradigm is itself a significant signal for the entire industry.

