Fine-tuning my first open LLM

Disclaimer: This is a draft blog post. I am dumping links, quotes and thoughts here until I'm actually ready to finalize it.

Table of Contents

Recommendations

Source: https://unsloth.ai/docs/get-started/fine-tuning-llms-guide 

  • 4 major approaches:
    • You'll also need to decide between normal/full fine-tuning, RL, QLoRA or LoRA training.
    • We recommend starting with QLoRA, as it is one of the most accessible and effective methods for training models. With our dynamic 4-bit quants, the accuracy loss for QLoRA compared to LoRA is now largely recovered.
  • How many parameters in base model?
    • If you're a beginner, it is best to start with a small instruct model like Llama 3.1 (8B) and experiment from there.
  • Research shows that training and serving in the same precision helps preserve accuracy. This means if you want to serve in 4-bit, train in 4-bit and vice versa. 
  • Full fine-tuning isn't needed
    • Unsloth also supports full fine-tuning (FFT) and pretraining, which require significantly more resources, but FFT is usually unnecessary. When done correctly, LoRA can match FFT.
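Some back-of-the-envelope arithmetic shows why LoRA (and QLoRA) is so much cheaper than full fine-tuning: only small low-rank adapter matrices are trained while the base weights stay frozen. The dimensions and rank below are illustrative assumptions (roughly one attention projection in an ~8B model, with a common default rank of 16), not figures from Unsloth's docs:

```python
# Back-of-the-envelope: trainable parameters for LoRA vs. full fine-tuning.
# The 4096x4096 layer size and rank r=16 are illustrative assumptions.

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A LoRA adapter replaces a d_in x d_out weight update with two
    low-rank factors: A (d_in x r) and B (r x d_out)."""
    return d_in * r + r * d_out

full = 4096 * 4096                      # parameters updated by full fine-tuning
lora = lora_params(4096, 4096, r=16)    # parameters updated by LoRA

print(full, lora, f"{lora / full:.1%}")
```

Summed over all layers the trained fraction stays around a percent of the model, which is why LoRA fits on modest hardware; QLoRA goes further by also storing the frozen base weights in 4-bit.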

Source: https://unsloth.ai/docs/get-started/fine-tuning-llms-guide/what-model-should-i-use#instruct-or-base-model

  • Instruct models are pre-trained with built-in instructions, making them ready to use without any fine-tuning. These models, including GGUFs and others commonly available, are optimized for direct usage and respond effectively to prompts right out of the box. Instruct models work with conversational chat templates like ChatML or ShareGPT.

Source: https://unsloth.ai/docs/get-started/fine-tuning-llms-guide/what-model-should-i-use#should-i-choose-instruct-or-base  

  • Should I Choose Instruct or Base?
    • Less than 300 Rows: For smaller datasets, the instruct model is typically the better choice. Fine-tuning the instruct model enables it to align with specific needs while preserving its built-in instructional capabilities. This ensures it can follow general instructions without additional input unless you intend to significantly alter its functionality.

Source: https://unsloth.ai/docs/get-started/fine-tuning-llms-guide/datasets-guide#how-big-should-my-dataset-be 

  • How big should my dataset be? We generally recommend using a bare minimum of at least 100 rows of data for fine-tuning to achieve reasonable results. For optimal performance, a dataset with over 1,000 rows is preferable, and in this case, more data usually leads to better outcomes. If your dataset is too small you can also add synthetic data or add a dataset from Hugging Face to diversify it. However, the effectiveness of your fine-tuned model depends heavily on the quality of the dataset, so be sure to thoroughly clean and prepare your data.
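The guidance above (at least 100 rows, thoroughly cleaned) is easy to enforce with a small pre-training check. This is a hypothetical helper of my own, not anything from the Unsloth docs; the `instruction`/`output` field names are just one common dataset convention:

```python
# Minimal dataset sanity pass before fine-tuning: drop incomplete rows
# and exact duplicates. `rows` is assumed to be a list of
# {"instruction": ..., "output": ...} dicts (field names vary by dataset).

def clean_dataset(rows: list[dict]) -> list[dict]:
    seen = set()
    cleaned = []
    for row in rows:
        instruction = row.get("instruction", "").strip()
        output = row.get("output", "").strip()
        if not instruction or not output:
            continue  # drop incomplete rows
        key = (instruction, output)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cleaned.append({"instruction": instruction, "output": output})
    return cleaned

rows = [
    {"instruction": "Say hi", "output": "Hi!"},
    {"instruction": "Say hi", "output": "Hi!"},   # duplicate
    {"instruction": "", "output": "orphan"},      # incomplete
]
cleaned = clean_dataset(rows)
print(len(cleaned))  # only 1 row survives; a real run should keep >= 100
```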
 
We recommend starting with Instruct models, as they allow direct fine-tuning using conversational chat templates (ChatML, ShareGPT etc.) and require less data compared to Base models (which use Alpaca, Vicuna etc.).


My Hardware and Software choices

  • Using a MacBook Air M5 with 32 GB of unified memory
  • Why Unsloth?
    • No particular reason; it gets decent reviews on Reddit and seems easy to get started with.
  • As recommended by Unsloth's docs, I will start with the simplest setup first:
    • I want to fine-tune a small model first: Qwen 3.5 9B (around the 8B size recommended by Unsloth)
    • I will try QLoRA first as recommended by Unsloth docs.
    • A minimum of 100 rows for fine-tuning, but possibly up to 1,000 rows if Hugging Face or Codex can provide training data in the ChatML or ShareGPT format needed for Instruct models.
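A rough memory estimate suggests QLoRA on a ~9B model should fit comfortably in 32 GB. All figures below are back-of-the-envelope assumptions on my part (4-bit base weights, fp16 adapters with Adam-style optimizer state, and a flat allowance for activations at short sequence lengths), not measurements:

```python
# Rough QLoRA memory estimate for a ~9B-parameter model.
# Every constant here is an assumption for illustration only.

GB = 1024**3

def qlora_memory_gb(n_params: float, lora_frac: float = 0.01,
                    activation_gb: float = 6.0) -> float:
    base = n_params * 0.5 / GB                 # 4-bit weights: 0.5 bytes/param
    adapters = n_params * lora_frac * 2 / GB   # fp16 LoRA adapter weights
    optimizer = adapters * 4                   # grads + Adam moments (fp32-ish)
    return base + adapters + optimizer + activation_gb

est = qlora_memory_gb(9e9)
print(f"~{est:.1f} GB")  # well under the 32 GB available
```

If this holds, there is plenty of headroom for longer sequences or a larger LoRA rank on the 32 GB machine.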

Related post