From the course: LLaMa for Developers


Fine-tuning larger LLaMA models

- [Instructor] In this video, we're going to cover how to fine-tune larger LLaMA models. Right now, we're on the QLoRA GitHub page. One of the limitations of QLoRA is that fine-tuning a 65-billion-parameter model, the largest of the original LLaMA sizes, still requires a 48-gigabyte GPU. The challenge is that our A100 on Colab only has 40 gigabytes, so unfortunately, we need to find another method. We could always use multiple GPUs, but in this case, we don't have them available on Colab. So for this video, we're going to take a more theoretical approach to fine-tuning a larger model. QLoRA and freezing some layers are great ways to fine-tune larger LLaMA models. The alternative is to use a third-party vendor. For example, Together.AI lets you fine-tune a 70-billion-parameter model. As you can see here, they have a pricing calculator on their website to estimate the cost. So for a 70-billion-parameter model and…
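To see why the 40-gigabyte A100 falls short, it helps to estimate the memory the model weights alone take at different precisions. This is a rough back-of-the-envelope sketch, not the QLoRA authors' calculation: the bytes-per-parameter figures are standard (fp16 = 2 bytes, int8 = 1 byte, 4-bit = 0.5 bytes), and the point is that even though 4-bit weights for a 65-billion-parameter model fit in roughly 33 GB, activations, LoRA adapter gradients, and optimizer state push the practical requirement up to the 48 GB quoted on the QLoRA page.

```python
# Back-of-the-envelope VRAM estimate for the 65B LLaMA weights.
# Bytes-per-parameter values are standard; everything else here is
# illustrative, not a measurement.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory footprint of the raw weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 65e9  # 65-billion-parameter LLaMA

for precision, bytes_per_param in [("fp16", 2.0),
                                   ("int8", 1.0),
                                   ("4-bit (QLoRA)", 0.5)]:
    gb = weight_memory_gb(NUM_PARAMS, bytes_per_param)
    print(f"65B @ {precision}: ~{gb:.1f} GB for weights alone")
```

Running this shows fp16 weights alone need about 130 GB and 4-bit weights about 32.5 GB, which is why quantization is what makes single-GPU fine-tuning plausible at all, and why the remaining training overhead is what breaks the 40 GB budget.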
