ChatGPT Cites Grokipedia: OpenAI Found a New Source of Truth
Sam Altman and Elon Musk are in the same boat again, albeit not by choice. Fragments from Grokipedia, a conservative AI encyclopedia, have started turning up in ChatGPT's answers
AI-processed from TechCrunch; edited by Hamidun News
The world of artificial intelligence is tight-knit, but we didn't expect it to get this tight this quickly. ChatGPT users have discovered something strange: the neural network has started pulling data from Grokipedia. If you missed the launch, Grokipedia is an ambitious project by Elon Musk's xAI that positions itself as a knowledge base without censorship and without a left-wing agenda. The irony is off the charts: Sam Altman's company, which Musk regularly accuses of excessive progressivism and secrecy, now draws on Musk's own creation to shape its answers.
To understand how we got here, recall the history of the conflict. Musk co-founded OpenAI, left after a loud falling-out, and later launched his own company, xAI, with an anti-woke agenda. Grokipedia became his answer to Wikipedia, which Musk considers too biased. It is a massive body of content generated and moderated by xAI's algorithms. And now that content is surfacing in ChatGPT. Why is this happening right now? The answer is simple and troubling at once: OpenAI desperately needs fresh data, and the company is not too choosy about how it gets it.
OpenAI's web crawlers comb the internet around the clock. When xAI opened Grokipedia to the public, its pages automatically became reachable by search bots. It appears that either Altman's engineers did not filter out Musk's domains, or the ranking algorithms deemed Grokipedia's content sufficiently relevant. The result is an amusing cocktail: ChatGPT, which usually tries to avoid controversial topics, suddenly outputs facts or interpretations characteristic of Grok. This isn't just a technical curiosity; it's a sign of an impending data crisis across the entire industry.
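The kind of domain filter mentioned above can be sketched in a few lines. This is only an illustration, not a description of OpenAI's actual pipeline: the blocklist contents and the function names `is_blocked` and `filter_sources` are assumptions made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; which domains a real crawler excludes is unknown.
BLOCKED_DOMAINS = {"grokipedia.com"}


def is_blocked(url, blocked=BLOCKED_DOMAINS):
    """Return True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocked)


def filter_sources(urls, blocked=BLOCKED_DOMAINS):
    """Keep only the URLs whose host is not on the blocklist."""
    return [url for url in urls if not is_blocked(url, blocked)]


if __name__ == "__main__":
    candidates = [
        "https://grokipedia.com/page/Example",
        "https://en.wikipedia.org/wiki/Example",
    ]
    print(filter_sources(candidates))
```

Matching on the parsed hostname rather than on a substring of the URL keeps a lookalike such as `notgrokipedia.com` from being caught by accident.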
We are approaching the moment when quality human-generated content on the web simply runs out. Developers of large language models are beginning to feed their systems content created by other models. In professional circles, the resulting decline is called model collapse or, less politely, digital incest. If ChatGPT learns from Grokipedia data while Grok continues to learn from ChatGPT's answers, we risk creating a closed echo chamber. In such a system, errors and biases compound with every cycle, gradually pulling AI away from reality.
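A toy calculation shows why such a loop is worrying. It rests on one strong assumption that does not come from any measurement: each generation inherits every error of the previous one and corrupts a fixed share of the statements that were still correct. The function name and the numbers are illustrative only.

```python
def accuracy_over_generations(initial_accuracy, fresh_error_rate, generations):
    """Toy model: each generation trains only on the previous generation's output.

    Inherited errors are never corrected, and a fixed share (fresh_error_rate)
    of the statements that were still correct is corrupted in each generation.
    """
    accuracies = [initial_accuracy]
    for _ in range(generations):
        accuracies.append(accuracies[-1] * (1.0 - fresh_error_rate))
    return accuracies


if __name__ == "__main__":
    # Assumed numbers: a fully accurate start and 10% fresh errors per cycle.
    for generation, accuracy in enumerate(accuracy_over_generations(1.0, 0.10, 5)):
        print(f"generation {generation}: {accuracy:.3f} of statements still correct")
```

Under these assumptions the share of correct statements shrinks by the same factor in every cycle, so the decay is geometric. Real training pipelines mix in human-written data and filtering, which this sketch deliberately ignores.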
For Elon Musk, the situation cuts both ways. On the one hand, his knowledge base has been recognized as significant enough to be cited by the market leader. On the other hand, OpenAI is using for free a resource on which xAI spent millions of dollars and enormous computing power. This recalls Reddit and Twitter, which restricted access to their APIs so that AI giants could not take their content for training without paying. It is quite likely that Musk's next move will be to block OpenAI's bots from Grokipedia or to file another loud lawsuit over intellectual property infringement.
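Closing a site to a specific crawler is usually done through robots.txt. The sketch below uses Python's standard `urllib.robotparser` to check a made-up robots.txt that turns away GPTBot, the user agent OpenAI documents for its training crawler. The file contents and the `example.com` URLs are assumptions for illustration and do not reflect Grokipedia's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt, not taken from any real site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/wiki/Topic"
    print("GPTBot allowed:", allowed("GPTBot", page))
    print("SomeOtherBot allowed:", allowed("SomeOtherBot", page))
```

Keep in mind that robots.txt is a request, not an enforcement mechanism: it only works against crawlers that choose to honor it.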
The borrowing of data between neural networks raises an important ethical question. If one AI cites another without a direct link to the original source, how can we verify the accuracy of the information? For now, we are watching two major industry competitors involuntarily merge into a single information space. This could lead either to the creation of a universal superintelligence or to a complete collapse of meaning, where one neural network simply retells the hallucinations of another.
The main point: will this case spark a new copyright war between Musk and Altman, or have we officially entered an era in which AI systems train one another until they lose all connection to the human original?