TechCrunch Released a Guide to AI Jargon: What Are LLMs, Hallucinations, and RAG
AI-processed from TechCrunch; edited by Hamidun News
The AI boom of 2024–2026 has brought not only new products and capabilities but also a whole layer of professional jargon that is easy to get lost in. LLM, RAG, hallucinations, tokens, fine-tuning, prompts: these words appear more and more often in news, presentations, and business negotiations, yet most people's understanding of them remains vague. TechCrunch has published a detailed glossary that explains the key concepts of the AI era, from basic architecture to applied techniques for working with models.
Most modern AI systems are built on large language models (LLMs): neural networks trained on massive volumes of text. They don't "understand" language in the human sense, but they can generate statistically plausible answers to a very wide range of queries.
The basic unit an LLM works with is a token: a chunk of text that averages roughly three to four characters and may be a short word, part of a word, or a punctuation mark. GPT-4o has a context window of up to 128,000 tokens, meaning the amount of text it can process at one time, which is roughly 300 pages. The larger the context window, the more information the model can take into account when formulating a response.
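The arithmetic behind those figures can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate, not a real tokenizer: the 4-characters-per-token ratio is the upper end of the rule of thumb above, the 1,700-characters-per-page figure is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math

CHARS_PER_TOKEN = 4        # assumption: upper end of the "three to four characters" rule of thumb
CHARS_PER_PAGE = 1_700     # assumption: one page of ordinary prose
CONTEXT_WINDOW = 128_000   # GPT-4o context window in tokens, as cited in the article


def estimate_tokens(text: str) -> int:
    """Rough token estimate from the character count; real tokenizers will differ."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """Check whether the estimated token count fits into the context window."""
    return estimate_tokens(text) <= window


def pages_for_window(window: int = CONTEXT_WINDOW) -> int:
    """Convert a context window in tokens into an approximate number of pages."""
    return round(window * CHARS_PER_TOKEN / CHARS_PER_PAGE)
```

Under these assumptions, `pages_for_window()` lands close to the 300 pages quoted above. Actual counts depend on the language of the text and on the specific model's tokenizer.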
One of the main shortcomings of LLMs is hallucinations. This refers to situations where the model confidently outputs factually incorrect information: invented quotes, non-existent sources, false dates. This is not a "lie" in the ethical sense — the model simply generates plausible-sounding text without having a built-in fact-checking mechanism.
Retrieval-Augmented Generation (RAG) was developed to combat hallucinations: before generating an answer, the system retrieves relevant passages from a real database and grounds its answer in them. Many corporate AI assistants and next-generation search systems work on this principle today. When a base model needs specialization, it is further trained on narrow data.
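The retrieve-then-generate principle behind RAG can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three documents are invented, simple word overlap stands in for the vector search a production system would likely use, and the final call to a language model is left out, so the code only shows how the retrieved passage ends up in the prompt.

```python
import re

# Hypothetical knowledge base; a real system would query a database or search index.
DOCUMENTS = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available on weekdays from 9:00 to 18:00.",
    "Shipping to Europe takes five to seven business days.",
]


def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most words with the query."""
    query_words = words(query)
    ranked = sorted(docs, key=lambda d: len(query_words & words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str]) -> str:
    """Put the retrieved passages in front of the question before it goes to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )
```

Calling `build_prompt("How long is the refund window?", DOCUMENTS)` produces a prompt that contains the refund-policy sentence, so the model has a concrete source to rely on rather than producing a plausible guess.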
This further training is called fine-tuning: the model learns to respond in the desired style, on specialized topics, or in a specific format. A more accessible approach is prompt engineering: wording requests carefully to get the desired result without retraining the model. AI agents are a separate and rapidly growing class of systems: they don't just answer questions but plan and execute chains of actions, such as searching for information on the internet, running code, and managing files and browsers.
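The plan-and-execute loop that distinguishes agents from plain question answering can be sketched as follows. This is a simplified illustration, not how any particular product works: the two tools are hypothetical stubs, and `plan_next_step` is a hard-coded stand-in for the decision a real agent would ask a language model to make.

```python
from typing import Callable, Optional


# Hypothetical tools; a real agent would call a search engine, a code sandbox, a browser.
def search(query: str) -> str:
    return f"3 results for '{query}'"


def summarize(text: str) -> str:
    return f"summary of [{text}]"


TOOLS: dict[str, Callable[[str], str]] = {"search": search, "summarize": summarize}


def plan_next_step(goal: str, history: list[tuple[str, str]]) -> Optional[tuple[str, str]]:
    """Stand-in for the model's planning: choose the next tool and its input."""
    if not history:
        return ("search", goal)               # first, look for information
    if len(history) == 1:
        return ("summarize", history[-1][1])  # then condense the previous result
    return None                               # nothing left to do


def run_agent(goal: str, max_steps: int = 5) -> list[tuple[str, str]]:
    """Repeat plan -> act -> record until the planner stops or the step limit is hit."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        step = plan_next_step(goal, history)
        if step is None:
            break
        tool, tool_input = step
        history.append((tool, TOOLS[tool](tool_input)))
    return history
```

The `max_steps` limit is a common safety measure in such loops: it keeps the agent from running indefinitely if the planner never decides that the goal has been reached.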
The best-known examples of such agents are Claude Computer Use and OpenAI Operator. Other key terms include parameters, the numerical "weights" of a neural network that determine its behavior (GPT-4 is estimated to have more than one trillion parameters); inference, the process of getting an answer from a trained model in real time, which is what determines the speed and cost of AI services; and embeddings, numerical representations of words and texts that make it possible to measure how close concepts are in meaning. Multimodality is a model's ability to work with several types of data at once: text, images, audio, and video.
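How embeddings make semantic closeness measurable can be shown with cosine similarity, a common way to compare such vectors. The three-dimensional vectors below are made up for illustration; real embeddings are produced by a trained model and usually have hundreds or thousands of dimensions.

```python
import math

# Invented toy vectors; real embeddings come from an embedding model.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


close = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["kitten"])
far = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["invoice"])
```

With these toy values, `close` comes out much higher than `far`, which is the numerical counterpart of saying that "cat" and "kitten" are closer in meaning than "cat" and "invoice".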
Understanding basic AI vocabulary is no longer the domain of developers alone — it has become a necessity for managers, investors, journalists, and everyone working with this technology. The jargon continues to expand: each new class of models brings its own terms — multi-agent systems, synthetic data, streaming inference. But by mastering the core — LLMs, tokens, hallucinations, RAG, fine-tuning, and agents — you can confidently navigate most publications and conversations about AI.