Fine-tuning LLMs with PEFT and LoRA

Published 2023-04-24
LoRA Colab: colab.research.google.com/drive/14xo6sj4dARk8lXZbO…
Blog Post: huggingface.co/blog/peft
LoRA Paper: arxiv.org/abs/2106.09685

In this video I look at how to use PEFT to fine-tune any decoder-style GPT model. It goes through the basics of LoRA fine-tuning and how to upload the trained adapter to the Hugging Face Hub. A minimal sketch of the setup is below.
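
For a quick preview of what the Colab covers, here is a minimal sketch of a PEFT LoRA setup, assuming the Hugging Face transformers and peft libraries; the model name and hyperparameter values are illustrative placeholders, not necessarily the ones used in the video:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "bigscience/bloom-560m"  # illustrative; any decoder-style causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

config = LoraConfig(
    r=8,                # rank of the low-rank update matrices
    lora_alpha=16,      # scaling factor applied to the LoRA update
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the small adapter weights are trainable

# ...train with the usual transformers Trainer or a custom loop...

# Push just the adapter weights (a few MB, not the full model) to the Hub:
model.push_to_hub("your-username/your-lora-adapter")  # hypothetical repo name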

My Links:
Twitter - twitter.com/Sam_Witteveen
Linkedin - www.linkedin.com/in/samwitteveen/

Github:
github.com/samwit/langchain-tutorials
github.com/samwit/llm-tutorials

00:00 - Intro
00:04 - Problems with fine-tuning
00:48 - Introducing PEFT
01:11 - Other cool PEFT techniques
01:51 - LoRA Diagram
03:25 - Hugging Face PEFT Library
04:06 - Code Walkthrough

All Comments (21)
  • You continue to make videos on exactly the things I'm trying to understand more deeply! Fantastic! There are a lot of detailed parameters in this video that you could certainly continue to elaborate on for those of us who aren't programmers...yet :) Looking forward to more of your vids!
  • @redfield126
    Perfect balance of theory and hands-on, with a Colab attached to most of your videos. Much appreciated. I recommend this channel to everyone who wants to follow this crazy trend of LLM releases; it's the best path to keep all of us up to date! I learn so much thanks to you, Sam. Thanks a ton. Keep moving forward.
  • This is great. Not many channels on YT do this kind of stuff. Would appreciate more like this: other frameworks like DeepSpeed, useful datasets, training parameter experiments, etc. So much interesting stuff that isn't covered on YT.
  • @nacs
    Many have said it but I'll reiterate -- your LLM videos are really great to watch, both the pace and the way you go from high-level overviews to the detailed info. I also appreciate that it's not just focused on ChatGPT/GPT-4/hosted models all the time and talks more about local training/fine-tuning/inference.
  • @victarion1571
    Sam, thanks for giving your audience what they ask for! The Alpaca training video you made makes much more sense now.
  • @briancase6180
    So this seems like the basis for a business: offer to train a custom model for product documentation, FAQs, etc. with a specific product or company focus. Cool!
  • @PattersML
    Awesome explanation, this is exactly what I was looking for. Thank you!
  • @notanape5415
    Thanks for the awesome explanation. Going to binge your videos.
  • @kaiman99919
    Thank you! It'd be great to see more on the data section; everyone always seems to gloss over that part, despite the fact that it's clearly the most important part. I've seen a lot of 20-40 minute videos (from different YouTubers) on the configuration that barely mention the actual use of the data.
  • @saracen9
    Awesome stuff, Sam. I'm in the process of using LangChain to build a vector store and, whilst it's fine for now, would be really interested in understanding the best way to then take this and use it to generate a LoRA. Feels like the logical next step.
  • @coolmcdude
    I would love to see more videos about this, showing people how we could adapt this to our own projects, and maybe even a video about 4-bit tuning.
  • I would love a vid covering examples of the differently formatted types of datasets that can be used to train a LoRA, and the abilities that the different kinds of dataset training will allow. Put another way: what kinds of behavioral changes in abilities can we use LoRA to fine-tune for in a model, and how do we then know what type of data formatting to use in order to get a chosen outcome? :D
  • Can you create more videos on instruction-prompt-tuning as well, as a further extension to this video? Amazing work!
  • I’d love a quick video like this on how to use checkpoints from PEFT training to do inference. When I’m training, I’m never sure how much is too much, and I can easily save checkpoints automatically to resume in case training stops. What I need to learn is how to use these checkpoints with the base model to do inference, so I can test output quality against several checkpoints. Ideally I’d like to be able to do inference on a base model plus checkpoint, and then, once I find a good result, merge the checkpoint into the base model so I can use it in production and keep VRAM low. (I am assuming inference on base model + checkpoint will use more VRAM.) A sketch of this workflow appears after the comments.
  • These fine-tuning-related topics are especially relevant to me right now. Currently training llama-30b variants at 4-bit. I’m very interested in how to roll adapters/checkpoints back into base models to keep VRAM usage down during inference (under 24GB)
  • Great quick tutorial. This is good for English-only pretraining/fine-tuning. What about non-English? What steps should we take to (1) extend the vocab, (2) pretrain (with or without LoRA) on a free-form, unstructured text corpus, and (3) fine-tune with LoRA for each task? Would love to have your tutorial down this road; it would be great. Thanks, Steve.
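
For the checkpoint-inference and merging questions above, a rough sketch assuming the peft library's PeftModel and its merge_and_unload method; the base model name and checkpoint path are hypothetical placeholders:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "bigscience/bloom-560m"  # hypothetical base model
base = AutoModelForCausalLM.from_pretrained(base_name)
tokenizer = AutoTokenizer.from_pretrained(base_name)

# Load the base model plus one training checkpoint's adapter for quality testing.
# The adapter weights are tiny, so this adds very little VRAM over the base alone.
model = PeftModel.from_pretrained(base, "outputs/checkpoint-500")  # hypothetical path

inputs = tokenizer("Write a haiku about fine-tuning:", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))

# Once a checkpoint tests well, fold the adapter into the base weights so
# production inference runs as a plain model with no adapter indirection:
merged = model.merge_and_unload()
merged.save_pretrained("my-merged-model")  # hypothetical output dir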