Generative AI

Generative AI refers to machine learning systems that produce new content—text, images, audio, video, or code—by learning statistical patterns from large training datasets. Unlike discriminative models that classify existing data, generative models synthesize outputs that did not previously exist.

Generative AI systems learn to model the underlying distribution of training data and sample from that distribution to create new instances. The category includes large language models (LLMs) for text and code, diffusion models for images and video, and autoregressive models for audio synthesis. Modern systems are trained on datasets ranging from hundreds of billions to trillions of tokens or billions of image-text pairs, requiring compute clusters of thousands of accelerators running for weeks or months.
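The "model the distribution, then sample from it" idea can be illustrated at toy scale. The sketch below is not how production systems work: it assumes a character-level bigram model, where the learned "distribution" is just a table of how often each character follows another. The names `fit_bigram`, `sample`, and `corpus` are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter, defaultdict


def fit_bigram(text):
    """Learn P(next char | current char) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def sample(counts, start, length, rng):
    """Generate up to `length` characters by sampling one character at a time."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # no observed successor for this character: stop early
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        # Draw the next character in proportion to its observed frequency.
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


# Toy training data (an assumption for this sketch, not from any real dataset).
corpus = "the cat sat on the mat. the dog sat on the log."
model = fit_bigram(corpus)
generated = sample(model, "t", 40, random.Random(0))
print(generated)
```

The output is a new string that need not appear anywhere in the corpus, yet every adjacent pair of characters in it was seen during training. Large generative models follow the same fit-then-sample pattern, with a neural network in place of the count table.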

The dominant architectures are autoregressive transformers (the GPT series, LLaMA, Claude, Gemini) for text generation and latent diffusion models (Stable Diffusion, DALL-E 3, Flux) for image synthesis. Text models are pretrained via next-token prediction and then aligned to human preferences through instruction fine-tuning followed by reinforcement learning from human feedback (RLHF) or direct preference optimization (DPO). Image models start from Gaussian noise and iteratively denoise it, guided by text embeddings that typically come from a text encoder such as one trained with contrastive language-image pretraining (CLIP).
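The two text-model objectives named above can be written down compactly. This is a minimal sketch in plain Python that works on precomputed probabilities and log-probabilities rather than on a real network. The function names, the argument names, and the default `beta=0.1` are assumptions made for illustration. The DPO expression follows the standard published form of the loss for a single preference pair.

```python
import math


def next_token_loss(token_probs):
    """Pretraining objective: average negative log-likelihood.

    `token_probs` holds the probability the model assigned to each token
    that actually came next in the training text.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    `policy_*` come from the model being trained and `ref_*` from a frozen
    reference model. "chosen" is the response humans preferred.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) rewritten as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))


# A model that puts more probability on the observed tokens has lower loss.
print(next_token_loss([0.9, 0.8, 0.95]) < next_token_loss([0.2, 0.1, 0.3]))

# Raising the preferred response's log-probability relative to the reference
# lowers the DPO loss. The numbers here are made up for the example.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0) < dpo_loss(-12.0, -10.0, -11.0, -11.0))
```

Both comparisons evaluate to true. In practice these quantities are computed over batches with an automatic-differentiation framework, but the arithmetic is the same.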

Generative AI automates and augments tasks previously requiring specialized human expertise: writing, coding, graphic design, music composition, and video production. A single capable model can simultaneously serve as a coding assistant, customer service agent, document summarizer, and data analyst. McKinsey's 2023 research estimated potential annual economic impact at $2.6–4.4 trillion across industries from productivity gains enabled by these systems.

As of 2026, leading text models include OpenAI's GPT-4o and o3, Anthropic's Claude 4-series, Google's Gemini 2.x, and open-weight models such as Meta's LLaMA 3. Video generation has matured: systems such as OpenAI Sora, Google Veo 2, and Kling produce multi-second photorealistic clips from text prompts. Multimodal models that process and generate text, image, audio, and video within a single system have become standard, and inference costs have dropped by roughly two orders of magnitude compared to 2023.

Example

A software company deploys a generative AI coding assistant fine-tuned on its internal codebase; developers use it to draft boilerplate functions, generate unit tests, and translate Python modules to TypeScript, reducing time spent on routine coding tasks by an estimated 30–40%.

Latest news on this topic

2026-06-29: NVIDIA TensorRT now scales generative AI inference across multiple GPUs
2026-06-29: Electronic Arts: generative AI helped developers unlock their creative potential
2026-05-29: Generative AI Overloads Australian Labor Courts by 70% in Three Years
2026-05-29: NASA's Creative Director on Branding in the Age of Generative AI
2026-05-17: The illusion of mastery: how generative AI makes beginners look like experts
