Foundation Model

A foundation model is a large-scale AI system pre-trained on broad, diverse data—text, images, code—to serve as a general-purpose base adaptable to many downstream tasks. The term was introduced by Stanford HAI in 2021; examples include GPT-4, Claude, Gemini, and LLaMA.

Foundation models are AI systems trained on massive, heterogeneous datasets using self-supervised objectives such as next-token prediction or masked-token prediction. The term was coined in a 2021 paper by the Stanford Center for Research on Foundation Models (CRFM) to distinguish these general-purpose bases from earlier task-specific models. Unlike a model trained solely to detect spam or classify medical images, a foundation model acquires broad representations of language, knowledge, and reasoning that can then be adapted. Examples span language (GPT-4o, Claude 3.5, LLaMA 3), vision (CLIP, DINOv2), and multimodal systems (Gemini, GPT-4V).
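To make the next-token prediction objective concrete, the following is a minimal sketch of the loss it minimizes: the average cross-entropy between the model's predicted distribution at each position and the token that actually comes next. The function names, the three-token vocabulary, and the logit values are illustrative assumptions; a real model would produce the logits from a neural network over a vocabulary of many thousands of tokens.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy of predicting token t+1 from position t.

    logits_per_position[t] holds the scores the (hypothetical) model
    assigns to every vocabulary entry after reading token_ids[: t + 1].
    """
    targets = token_ids[1:]  # the label at position t is the next token
    if len(logits_per_position) != len(targets):
        raise ValueError("need one logit vector per predicted token")
    losses = []
    for logits, target in zip(logits_per_position, targets):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


# Toy sequence over a 3-token vocabulary (values chosen for illustration).
token_ids = [0, 2, 1]

# A model with no knowledge spreads probability evenly over the vocabulary.
uniform_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# A better model puts most of its probability on the true next token.
confident_logits = [[0.0, 0.0, 5.0], [0.0, 5.0, 0.0]]

uniform_loss = next_token_loss(uniform_logits, token_ids)
confident_loss = next_token_loss(confident_logits, token_ids)
print(f"uniform: {uniform_loss:.4f}, confident: {confident_loss:.4f}")
```

No human labels are needed here: the targets come from the text itself, shifted by one position, which is what makes the objective self-supervised.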

Training a foundation model typically involves two phases. Pre-training on large corpora (often hundreds of billions to trillions of tokens of text, code, or image-caption pairs) builds general representations. Downstream adaptation then shapes the model for specific applications through fine-tuning, reinforcement learning from human feedback (RLHF), or prompt engineering, each of which is typically orders of magnitude cheaper than pre-training a model from scratch. This 'train once, adapt many times' paradigm is the economic engine behind the current wave of AI products.

Foundation models shifted the economics of AI development by creating a shared substrate. Before them, each new application required a dedicated data pipeline and training run. With a foundation model, a small team can build a competitive product by fine-tuning or prompting a pre-existing base. This concentration of capability in a small number of base systems also concentrates risks, which has driven regulatory attention globally.

As of 2026, foundation models range from closed proprietary systems like GPT-4o and Claude 4 to openly licensed releases like Meta's LLaMA 3 (up to 405 billion parameters) and Mistral's model family. Multimodal foundation models that process text, images, audio, and video within a single system have become standard rather than exceptional. The EU AI Act imposes transparency obligations on providers of 'general-purpose AI models' and presumes that models trained with more than 10²⁵ FLOPs of cumulative compute pose systemic risk, which triggers additional evaluation and risk-mitigation requirements. This makes the foundation model concept legally operative across the European Union.
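To get a feel for where the 10²⁵ FLOPs threshold sits, the sketch below estimates training compute with the widely used rule of thumb C ≈ 6 · N · D for dense transformer models, where N is the parameter count and D the number of training tokens. That approximation is an assumption brought in for illustration, not something the regulation prescribes, and the token counts are hypothetical round numbers rather than reported figures.

```python
import math

# Compute threshold named in the text above.
EU_AI_ACT_THRESHOLD_FLOPS = 1e25


def estimate_training_flops(n_parameters, n_tokens):
    """Rough training compute via the 6 * N * D heuristic (an assumption).

    The heuristic counts about 6 floating-point operations per parameter
    per training token for a dense transformer; real runs can differ.
    """
    return 6.0 * n_parameters * n_tokens


def exceeds_threshold(n_parameters, n_tokens,
                      threshold=EU_AI_ACT_THRESHOLD_FLOPS):
    """True if the estimated compute is above the given threshold."""
    return estimate_training_flops(n_parameters, n_tokens) > threshold


# Hypothetical training runs; the token counts are illustrative only.
small_run = estimate_training_flops(7e9, 2e12)     # 7B params, 2T tokens
large_run = estimate_training_flops(405e9, 15e12)  # 405B params, 15T tokens

print(f"small run: {small_run:.2e} FLOPs, above threshold: "
      f"{exceeds_threshold(7e9, 2e12)}")
print(f"large run: {large_run:.2e} FLOPs, above threshold: "
      f"{exceeds_threshold(405e9, 15e12)}")
```

Under these assumed inputs the smaller run lands around 10²² to 10²³ FLOPs, well below the threshold, while the larger one lands in the 10²⁵ range and crosses it.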

Example

An enterprise builds a customer-service chatbot by fine-tuning a foundation model like LLaMA 3 on its own support ticket history, gaining domain knowledge without training a model from scratch.
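As a rough illustration of the first step in such a project, the sketch below turns support tickets into prompt/completion pairs in JSON Lines form, a layout many fine-tuning toolkits accept in some variant. The ticket field names (`question`, `resolution`), the output keys, and the sample tickets are all hypothetical; the exact schema depends on the training tool actually used.

```python
import json


def tickets_to_jsonl(tickets):
    """Convert support tickets into JSON Lines training records.

    Each ticket is assumed to be a dict with hypothetical 'question'
    and 'resolution' fields. Tickets missing either one are skipped,
    since they give the model nothing to learn from.
    """
    lines = []
    for ticket in tickets:
        question = ticket.get("question", "").strip()
        resolution = ticket.get("resolution", "").strip()
        if not question or not resolution:
            continue
        record = {"prompt": question, "completion": resolution}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Invented sample data for illustration.
sample_tickets = [
    {"question": "How do I reset my password?",
     "resolution": "Use the 'Forgot password' link on the login page."},
    {"question": "My invoice shows the wrong address.",
     "resolution": ""},  # unresolved ticket, will be skipped
    {"question": "Can I export my data?",
     "resolution": "Yes, go to Settings and choose Export."},
]

jsonl_text = tickets_to_jsonl(sample_tickets)
print(jsonl_text)
```

In practice the ticket history would also need deduplication and removal of personal data before it is used for training; those steps are outside this sketch.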
