A neural network as a time machine: why LLMs are taught to think the old-fashioned way

Researchers have found a paradoxical way to use LLMs: instead of expanding training data, they restrict it, creating models that “think” like people from past e

AI-processed from Habr AI; edited by Hamidun News

Language models are conventionally judged by how much they know: the more data a neural network has absorbed, the more capable it is assumed to be. But a group of researchers and enthusiasts has turned this logic on its head. They deliberately shrink the training dataset, limiting it to texts from a specific historical era, and get something unexpected: an artificial intelligence that reasons as if it lived in the 17th or early 20th century.

At first glance, the idea seems like an exotic whim. Why would anyone need a model that knows nothing about antibiotics, the theory of relativity, or the internet? There is, however, a serious scientific motivation behind it. Modern LLMs are trained on text corpora spanning the entire history of written language up to the present day. They inevitably view the past through the lens of the present, with its terminology, values, and accumulated knowledge. A model trained only on texts from before 1912 lacks this retrospective lens. It does not simply reproduce the words of an era. It reproduces the era's way of thinking, its blind spots, and its confidence in ideas we have long since dismissed as delusions.

Technically, the approach works as follows. Researchers take a standard language model architecture, typically a relatively compact one, since the volume of surviving historical text is limited. The training corpus is built exclusively from sources dated to a specific period: books, newspapers, letters, scientific treatises, legal documents. It is critical to exclude any text written after the chosen cutoff date. As a result, the model absorbs not only the vocabulary and grammar of the era but also its epistemological framework, that is, the boundaries of what people of that time considered possible, true, and permissible.
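The cutoff-date filtering described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular project: the `Document` record, the `filter_by_cutoff` function, and the sample titles are hypothetical, and the sketch assumes every source already carries a trustworthy publication year.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Document:
    """Hypothetical record for one source in a historical corpus."""
    title: str
    year: Optional[int]  # publication year; None when the date is unknown
    text: str


def filter_by_cutoff(docs: Iterable[Document], cutoff_year: int) -> list[Document]:
    """Keep only documents dated strictly before cutoff_year.

    Undated documents are dropped (an assumption of this sketch),
    because they could leak knowledge from after the cutoff.
    """
    return [d for d in docs if d.year is not None and d.year < cutoff_year]


# Hypothetical sample corpus.
corpus = [
    Document("Treatise on humours", 1650, "..."),
    Document("Provincial newspaper", 1905, "..."),
    Document("Popular physics primer", 1925, "..."),
    Document("Undated pamphlet", None, "..."),
]

kept = filter_by_cutoff(corpus, cutoff_year=1912)
kept_titles = [d.title for d in kept]
print(kept_titles)
```

Dropping undated sources is a deliberately conservative choice. In practice a catalogue year may describe a later edition with a modern foreword or annotations, so a metadata filter like this would probably need to be combined with manual review.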

Such 'temporal' models turn out to have far more applications than one might assume. In epistemology, the philosophical study of knowledge, they let researchers investigate how the mechanisms of knowledge formation themselves have changed. You can ask a 1650-era model about the nature of disease and receive an answer based on humoral theory, not as a stylization but as the genuine position of a system for which germ theory simply does not exist. This gives scholars a unique tool for modeling historical paradigms of thought.

In the behavioral sciences, such models help study how cultural and informational context shapes behavior and decisions. By confining an LLM to the knowledge of a specific era, researchers can model reactions to events, economic decisions, and social attitudes, and then compare them with actual historical data. In essence, this is a form of computational historical psychology that would have been unthinkable just a few years ago.

The educational potential is also impressive. Imagine an interactive dialogue with a 'scholar' from the Age of Enlightenment who not only cites 18th-century texts but consistently reasons within the framework of that era's worldview. A student can ask questions, argue, encounter logic that was flawless for its time but looks absurd today. This is a powerful way to demonstrate that knowledge is not an absolute value, but a historically conditioned process.

Several open initiatives are already working in this direction. The projects discussed by Beeline Cloud specialists develop both the models themselves and the methodology for preparing historical corpora. The key challenge here is data quality. Digitized texts from past centuries often contain optical character recognition (OCR) errors, and source selection requires serious expertise from historians so that the corpus adequately represents the thinking of the era as a whole, not just that of its literary elite.
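One way to triage digitized pages for recognition errors is a crude character-level heuristic. The sketch below is an assumption for illustration, not a technique attributed to the projects mentioned: the function `suspicious_token_ratio` and the sample strings are hypothetical, and the check only catches tokens that mix letters with digits or stray symbols.

```python
import re

# A "plausible" token is either a run of letters (optionally joined by
# apostrophes or hyphens) or a plain number. Anything else is suspicious.
PLAUSIBLE_RE = re.compile(r"(?:[^\W\d_]+(?:['’-][^\W\d_]+)*|\d+)")

# Punctuation stripped from both ends of a token before checking it.
EDGE_PUNCT = ".,;:!?\"'()[]“”‘’"


def suspicious_token_ratio(text: str) -> float:
    """Return the share of tokens that look like recognition errors."""
    tokens = [t.strip(EDGE_PUNCT) for t in text.split()]
    tokens = [t for t in tokens if t]  # ignore tokens that were only punctuation
    if not tokens:
        return 0.0
    bad = sum(1 for t in tokens if not PLAUSIBLE_RE.fullmatch(t))
    return bad / len(tokens)


# Hypothetical samples: a clean sentence and a noisy scan of similar text.
clean_page = "The humours of the body are four in number."
noisy_page = "Tlie hum0urs of tbe b|ody arc f0ur"

clean_score = suspicious_token_ratio(clean_page)
noisy_score = suspicious_token_ratio(noisy_page)
print(clean_score, noisy_score)
```

The heuristic misses misreadings that still consist only of letters (such as "tbe" for "the"), so a score like this would at best rank pages for human review rather than replace it.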

And there is a fundamental question that this approach raises. If a model trained on texts from the past reproduces the delusions and prejudices of its time, what does this say about modern LLMs? They are equally constrained by the boundaries of our era—we simply don't yet know which of our 'obvious truths' future generations will find naive. Temporal models become a mirror that reminds us: any intelligence, artificial or otherwise, is a product of its time. And awareness of this fact may be more valuable than any technological breakthrough.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
