From the course: LLaMa for Developers


Fine-tuning LLaMA and freezing layers


- [Instructor] So it turns out fine-tuning LLaMA isn't as easy as it seems. And that's okay. That's the fun of working with AI. In this video, we're going to cover the techniques we need in order to enable our fine-tuning. Let's go to the second cell and talk a little bit about how much memory we need to train our model. There are three key things that drive memory use during training. The first is loading the model parameters onto the GPU, which takes four bytes per parameter. The second is the optimizer, which uses eight bytes per parameter. And the third is storing the gradients, which requires another four bytes per parameter. Add those up and we get 16 bytes per parameter. Multiply that by seven billion parameters and we need 112 gigabytes of GPU memory. That's a lot. We can reduce this with three techniques. The first one is freezing…
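The arithmetic above can be sketched as a small helper. This is a rough back-of-the-envelope estimate, not an exact profiler: the byte counts assume full-precision (fp32) weights and gradients plus an Adam-style optimizer with two fp32 moment buffers, and the function name is illustrative, not from the course.

```python
def training_memory_gb(num_params: float) -> float:
    """Rough memory estimate for full fine-tuning (hypothetical helper).

    Per parameter:
      - 4 bytes for the fp32 weight loaded on the GPU
      - 8 bytes for Adam-style optimizer state (two fp32 moments)
      - 4 bytes for the fp32 gradient
    """
    bytes_per_param = 4 + 8 + 4  # 16 bytes total
    return num_params * bytes_per_param / 1e9


# A 7B-parameter model needs roughly 112 GB just for these three buffers,
# before counting activations, the KV cache, or batch data.
print(training_memory_gb(7e9))  # → 112.0
```

Note this estimate leaves out activation memory, which grows with batch size and sequence length, so the real requirement is even higher.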
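Freezing layers cuts this cost because frozen parameters need no gradient or optimizer state. Here is a minimal PyTorch sketch of the idea, using a toy stack of linear layers as a stand-in for the transformer blocks (the model and layer sizes are made up for illustration):

```python
import torch.nn as nn

# Toy stand-in for a transformer: a stack of 8 layers (hypothetical sizes).
model = nn.Sequential(*[nn.Linear(16, 16) for _ in range(8)])

# Freeze everything except the last two layers. Frozen parameters skip
# gradient computation, so no gradient or optimizer memory is allocated
# for them during training.
for layer in model[:-2]:
    for p in layer.parameters():
        p.requires_grad_(False)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable}/{total}")
```

With a real LLaMA checkpoint the loop is the same: iterate over `model.parameters()` (or a slice of the layer list) and call `requires_grad_(False)` on everything you want to keep frozen.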
