Fine-tuning
Fine-tuning is the process of further training a pre-trained AI model on a smaller, task-specific dataset so it performs better on that task. Instead of building a model from scratch, you adapt an existing one to your domain, style or output format.
Fine-tuning takes a foundation model that already understands language and continues its training on a narrow dataset — typically hundreds to tens of thousands of examples of the inputs and outputs you care about. The model's weights shift toward your task: legal drafting, medical coding, a brand's tone of voice, or a strict output format like JSON.
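As a rough illustration of what such a dataset can look like, the sketch below builds two input/output pairs for a JSON-extraction task and serializes them as JSON Lines (one example per line). The field names "prompt" and "completion", the sample invoices, and the helper functions are assumptions made for illustration. Each training tool defines its own schema, so check the format your tooling expects.

```python
import json

# Hypothetical field names ("prompt", "completion"); real training tools
# each define their own schema, so treat this layout as an assumption.
examples = [
    {
        "prompt": "Extract the invoice total: 'Total due: $120.50'",
        "completion": json.dumps({"total": 120.50, "currency": "USD"}),
    },
    {
        "prompt": "Extract the invoice total: 'Amount payable: EUR 89'",
        "completion": json.dumps({"total": 89.0, "currency": "EUR"}),
    },
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def validate(jsonl_text):
    """Check that every line parses and has a non-empty prompt and completion."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    for row in rows:
        if not row.get("prompt") or not row.get("completion"):
            raise ValueError(f"incomplete example: {row!r}")
    return len(rows)


jsonl_text = to_jsonl(examples)
print(validate(jsonl_text), "valid examples")
```

Every completion here follows the same strict JSON shape. That repetition is what lets the model pick up the format, so a validation pass like this one is worth running before any training job.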
Fine-tuning is best used when you need consistent behavior, not fresh knowledge. Teaching a model new facts via fine-tuning is expensive and unreliable: facts change, and a fine-tuned model can still hallucinate. Teaching it a style, a format, or a decision policy works well, because those patterns repeat across the training examples.
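In practice, teams compare three options: prompt engineering (cheapest, no training), retrieval-augmented generation (RAG; fresh and private knowledge supplied at query time), and fine-tuning (stable behavior at lower per-request cost, since long instructions move from the prompt into the weights). Parameter-efficient methods like LoRA (low-rank adaptation), which freeze the base weights and train only small added matrices, can make fine-tuning feasible even on a single GPU, depending on model size.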
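To see why a method like LoRA shrinks the training problem, the sketch below shows the low-rank idea in plain Python. The base weight matrix W stays frozen, and only two small matrices, A (rank x d_in) and B (d_out x rank), are trained. Their product is scaled by alpha / rank and added to the base output. The layer size of 4096, the rank of 8, and all function names are illustrative assumptions, not values taken from any particular model or library.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W is frozen, A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def full_params(d_in, d_out):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


# Illustrative sizes only: one 4096 x 4096 projection adapted at rank 8.
d_in = d_out = 4096
rank = 8
full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  share: {lora / full:.2%}")
```

Under these assumed sizes the adapter trains 65,536 parameters instead of 16,777,216 for that layer, which is roughly 0.4% of the original count. B is conventionally initialized to zeros, so the adapted model starts out identical to the base model and drifts toward the task only as training proceeds.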