What is fine-tuning for LLMs?
Fine-tuning a machine-learning model means taking a pre-trained base model and continuing to train it on your own data that is specific to the task you want the model to perform. In the simplest form, the base model's weights are frozen, a few new layers are added on top, and only those new layers are trained; alternatively, some or all of the base weights can be updated as well.
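The frozen-base variant can be sketched in a few lines. This is a toy illustration, not a real LLM: a fixed random projection stands in for the pre-trained layers, and a small logistic-regression head (the only trainable part) is fit to synthetic task data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "base model": a frozen random projection standing in
# for pre-trained layers whose weights we do NOT update.
W_base = rng.normal(size=(16, 8))

def base_features(x):
    return np.tanh(x @ W_base)            # frozen forward pass

# New task-specific head: the only trainable parameters.
w_head = np.zeros(8)

# Synthetic task data (assumed for illustration).
X = rng.normal(size=(200, 16))
y = (base_features(X).sum(axis=1) > 0).astype(float)

# Train only the head with gradient descent on logistic loss;
# W_base never changes.
for _ in range(500):
    z = base_features(X) @ w_head
    p = 1.0 / (1.0 + np.exp(-z))
    w_head -= 0.5 * base_features(X).T @ (p - y) / len(y)

acc = ((1.0 / (1.0 + np.exp(-base_features(X) @ w_head)) > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because the task label here is a linear function of the frozen features, the small head alone is enough to fit it, which mirrors why fine-tuning a head on strong pre-trained features often works well.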
Fine-tuning large language models (LLMs) is becoming increasingly impractical due to their rapidly growing size. The largest LLMs no longer fit on a single GPU, so they must be trained in parallel across many GPUs, which drives up the cost of fine-tuning. That said, there are many open-source LLMs that can be downloaded and fine-tuned. Many people have downloaded Facebook's LLaMA model and fine-tuned it for their own tasks, showing that LLMs can be fine-tuned on a single large GPU.
Supervised Fine-Tuning (SFT)
This approach trains the model on labeled prompt-response pairs so that it learns to answer specific kinds of questions, rather than training on entire free-form dialogues. It is the most common type of fine-tuning. When the best open-source foundation models, such as Llama-2, are too large to fully fine-tune on a single GPU, it has become popular to use Parameter-Efficient Fine-Tuning (PEFT) methods, which update only a small fraction of the model's parameters. LoRA (low-rank adaptation of large language models) is one such PEFT method used for SFT.
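The core idea of LoRA can be shown with plain numpy. A frozen weight matrix W gets a trainable low-rank update scaled by alpha/r; the dimensions and scaling below are illustrative values, not a prescription.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 512, 8, 16          # hidden size, LoRA rank, scaling (illustrative)

# Frozen pre-trained weight matrix (stand-in for e.g. an attention projection).
W = rng.normal(size=(d, d))

# LoRA adds a trainable low-rank update B @ A. B starts at zero, so the
# adapted model is initially identical to the base model.
A = rng.normal(scale=0.01, size=(r, d))   # trainable
B = np.zeros((d, r))                      # trainable

def adapted_forward(x):
    # y = x W^T + (alpha / r) * x A^T B^T
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

full_params = W.size
lora_params = A.size + B.size
print(f"full fine-tuning: {full_params:,} params")
print(f"LoRA (r={r}):     {lora_params:,} params "
      f"({100 * lora_params / full_params:.2f}% of full)")
```

With these numbers, LoRA trains roughly 3% of the parameters of full fine-tuning for this one matrix, which is why a single large GPU can handle models that full fine-tuning cannot.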
Reinforcement Learning from Human Feedback (RLHF)
RLHF fine-tunes a model using a reward signal learned from human preference rankings of model outputs, so it requires high-quality dialogue data. It also requires a significant amount of compute, since it involves training a separate reward model and then optimizing the LLM against it, typically with a reinforcement-learning algorithm such as PPO.
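The reward model at the heart of RLHF is usually trained with a pairwise (Bradley-Terry) loss: for each prompt, humans mark one response "chosen" and one "rejected", and the loss pushes the chosen response's reward above the rejected one's. A minimal sketch, using made-up reward scores rather than real model outputs:

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch.
    # log1p(exp(-x)) is a numerically stable form of -log sigmoid(x).
    return float(np.mean(np.log1p(np.exp(-(r_chosen - r_rejected)))))

# Illustrative reward scores (assumed values, not real model outputs).
r_chosen = np.array([1.2, 0.8, 2.0])
r_rejected = np.array([0.3, 1.0, -0.5])
print(f"preference loss: {preference_loss(r_chosen, r_rejected):.4f}")
```

The loss shrinks as the margin between chosen and rejected rewards grows, so minimizing it teaches the reward model to score preferred responses higher; the LLM is then optimized to maximize that learned reward.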