# Fine-tune Qwen 3 on a custom dataset
Qwen 3 is one of the strongest open-weight models for reasoning and multilingual tasks. This guide shows how to fine-tune Qwen 3 7B on your own dataset with Soup CLI.
## 1. Install

```bash
pip install 'soup-cli[fast]'
```

## 2. Dataset (ShareGPT format)
Qwen 3 handles multi-turn conversations well. Use the ShareGPT format:
```json
[
  {
    "conversations": [
      {"from": "user", "value": "Explain gradient descent in one paragraph."},
      {"from": "assistant", "value": "Gradient descent is..."}
    ]
  }
]
```

## 3. Config
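The `conversations.json` file referenced by the config can be generated from plain question/answer pairs with a short script. This is a minimal sketch; the `pairs` data here is a hypothetical stand-in for your own records:

```python
import json

# Hypothetical raw data: plain (question, answer) pairs.
pairs = [
    ("Explain gradient descent in one paragraph.", "Gradient descent is..."),
    ("What is a learning rate?", "The learning rate controls the step size..."),
]

# Wrap each pair in the ShareGPT structure shown above.
records = [
    {
        "conversations": [
            {"from": "user", "value": question},
            {"from": "assistant", "value": answer},
        ]
    }
    for question, answer in pairs
]

with open("conversations.json", "w") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)

print(f"Wrote {len(records)} conversations")
```

Multi-turn conversations work the same way: append additional alternating `user`/`assistant` entries to the `conversations` list.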
```yaml
base:
  model: Qwen/Qwen3-7B-Instruct
  task: sft
data:
  train: conversations.json
  format: sharegpt
training:
  backend: unsloth
  epochs: 2
  learning_rate: 1.5e-4
  batch_size: 4
  gradient_accumulation_steps: 4
  max_seq_length: 4096
lora:
  enabled: true
  r: 32
  alpha: 64
  target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
```

**Why target all projections?** Qwen 3 benefits from applying LoRA to both the attention and MLP projections for instruction-following tasks.
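Two details of the config are worth unpacking. First, the effective batch size is `batch_size × gradient_accumulation_steps = 4 × 4 = 16`. Second, with `r: 32` and `alpha: 64`, LoRA learns a low-rank update that is added to each frozen weight scaled by `alpha / r = 2`. The sketch below illustrates the mechanics with NumPy; the matrix dimensions are illustrative, not Qwen 3's actual shapes:

```python
import numpy as np

# Illustrative projection dimensions, not Qwen 3's actual shapes.
d_in, d_out = 4096, 4096
r, alpha = 32, 64  # matches the lora section of the config

# LoRA replaces a frozen weight W with W + (alpha / r) * B @ A,
# training only the low-rank factors A and B.
A = np.random.randn(r, d_in) * 0.01  # small random init
B = np.zeros((d_out, r))             # zero init, so the update starts at 0

delta_w = (alpha / r) * B @ A

full_params = d_in * d_out
lora_params = r * (d_in + d_out)
print(
    f"trainable params per projection: {lora_params:,} vs {full_params:,} "
    f"({100 * lora_params / full_params:.1f}% of full fine-tuning)"
)
```

This is why targeting all seven projections is still cheap: each one adds only `r * (d_in + d_out)` trainable parameters on top of the frozen base model.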
## 4. Train

```bash
soup train --config qwen3.yaml
```

## 5. Serve with vLLM
```bash
pip install 'soup-cli[serve-fast]'
soup serve --adapter ./runs/qwen3/latest --backend vllm --port 8000
```

The model is now available at `http://localhost:8000/v1/chat/completions` with an OpenAI-compatible API.
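Because the endpoint is OpenAI-compatible, any OpenAI-style client can query it. A minimal sketch using only the standard library (the `model` name in the payload is hypothetical; check what your server actually registers):

```python
import json
import urllib.request

# Build an OpenAI-style chat completion request for the local server.
payload = {
    "model": "qwen3-finetune",  # hypothetical served model name
    "messages": [
        {"role": "user", "content": "Explain gradient descent in one paragraph."}
    ],
    "max_tokens": 256,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
except OSError as exc:
    # The server is not running yet (or still loading the model).
    print(f"request failed: {exc}")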
## Tips

- Qwen 3 uses a ~151k-token vocabulary; stick to `max_seq_length: 4096` or higher for best results.
- Use `training.neftune_alpha: 5` to inject noise into the embeddings during training, which improves generalization on small datasets.
- For reasoning tasks, try `training.packing: true` for efficient long-context training.
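For intuition on the `neftune_alpha` tip: NEFTune adds uniform noise to the token embeddings during training, scaled by `alpha / sqrt(seq_len * hidden_dim)`. A sketch with illustrative shapes (this is the underlying idea, not soup-cli's internals):

```python
import numpy as np

def neftune_noise(embeddings: np.ndarray, alpha: float = 5.0) -> np.ndarray:
    """Add NEFTune-style uniform noise to a (seq_len, hidden_dim) embedding matrix."""
    seq_len, hidden_dim = embeddings.shape
    # Each element is perturbed by at most alpha / sqrt(seq_len * hidden_dim).
    scale = alpha / np.sqrt(seq_len * hidden_dim)
    noise = np.random.uniform(-1.0, 1.0, size=embeddings.shape) * scale
    return embeddings + noise

# Illustrative shapes: 128 tokens, hidden dim 1024.
emb = np.random.randn(128, 1024).astype(np.float32)
noisy = neftune_noise(emb, alpha=5.0)
print(float(np.abs(noisy - emb).max()))
```

The noise is applied only during training; inference uses the clean embeddings.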
## Related
- [DPO alignment guide](/docs/dpo-training-guide)
- [Data formats reference](/docs/data-formats)