dstack Guide: Fine-Tune DeepSeek R1 Distill Qwen 1.5B with centron Cloud GPUs

dstack is built to run AI and machine learning jobs in an efficient, organized way. Every task represents a defined workload, for example model training or data processing, and it can execute in a cloud setup or on your own infrastructure. Tasks are described through a straightforward configuration file where you set requirements such as GPU resources, needed packages, and execution scripts. dstack takes care of scheduling, running, and assigning resources, so tasks complete reliably without constant manual handling. This helps teams operate AI workloads more easily, scale when required, and improve performance without wrestling with complicated infrastructure.

In this guide, you will fine-tune the DeepSeek R1 Distill Qwen 1.5B model with dstack on centron Cloud GPUs. dstack streamlines workload control, using centron’s powerful GPU infrastructure to train efficiently with LoRA tuning and Weights & Biases (W&B) tracking.

Create a Virtual Environment

In this part, you will set up a virtual environment on your system and get everything ready for installing dstack and running the fine-tuning task.

Install the venv package

console

$ apt install python3-venv -y

Create a virtual environment

console

$ python3 -m venv dstack-env

Activate the virtual environment

console

$ source dstack-env/bin/activate
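
To confirm the environment is active, you can optionally check which Python interpreter is on your PATH; it should resolve to a path inside the dstack-env directory.

console

$ which python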

Install dstack

In this section, you will install all required dstack components and start the dstack server so the fine-tuning task can be deployed later.

Create a backend directory

Create the directory that will hold the backend configuration file.

console

$ mkdir -p ~/.dstack/server

Create a backend YAML file

Create a config.yml file that declares centron as the backend provider.

console

$ nano ~/.dstack/server/config.yml

Add the provider configuration

Copy and insert the configuration below.

YAML

projects:
- name: main
  backends:
  - type: centron
    creds:
      type: api_key
      api_key: <centron-account-api-key>

Retrieve your centron API key and replace <centron-account-api-key> with it.

Save and close the file.

Install dstack

console

$ pip install "dstack[all]" -U

Start the dstack server

console

$ dstack server

Write down the URL where the dstack server is running and the token shown in the output.

Connect the CLI to the server

console

$ dstack config --url <URL> \
    --project main \
    --token <TOKEN>
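
The dstack config command saves these connection details to the CLI configuration file, typically ~/.dstack/config.yml. If the CLI later fails to reach the server, you can inspect that file to confirm the URL, project, and token were stored:

console

$ cat ~/.dstack/config.yml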

Run a Fine-tuning Task

Here, you will set up and launch a training task with dstack on centron Cloud GPUs. You will define the task, prepare environment variables, and run fine-tuning for the DeepSeek-R1-Distill-Qwen-1.5B model.

Create and enter a project directory

Stay inside the dstack-env virtual environment, then create a folder and move into it.

console

$ mkdir quickstart && cd quickstart

Initialize the directory

console

$ dstack init

Create a dstack task configuration file

Create a YAML file that defines the dstack fine-tuning task.

console

$ nano .dstack.yaml

Paste the task definition

Copy and paste the configuration below.

YAML

type: task
# The name is optional, if not specified, generated randomly
name: trl-train

python: "3.10"

nvcc: true
# Required environment variables
env:
- WANDB_API_KEY
- WANDB_PROJECT
- MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
# Commands of the task
commands:
- git clone https://github.com/huggingface/trl.git
- pip install transformers
- pip install trl
- pip install peft
- pip install wandb
- cd trl/trl/scripts
- python sft.py
    --model_name_or_path $MODEL_ID
    --dataset_name trl-lib/Capybara
    --learning_rate 2.0e-4
    --num_train_epochs 1
    --packing
    --per_device_train_batch_size 2
    --gradient_accumulation_steps 8
    --gradient_checkpointing
    --logging_steps 25
    --eval_strategy steps
    --eval_steps 100
    --use_peft
    --lora_r 32
    --lora_alpha 16
    --report_to wandb
    --output_dir DeepSeek-R1-Distill-Qwen-1.5B-SFT

resources:
  gpu:
    # 24GB or more vRAM
    memory: 24GB..

Save and close the file.

The YAML file sets up a dstack task that fine-tunes DeepSeek-R1-Distill-Qwen-1.5B with Hugging Face TRL and LoRA. It clones the TRL repo, installs the needed libraries, and runs supervised fine-tuning on trl-lib/Capybara, using gradient checkpointing and gradient accumulation (an effective batch size of 16: 2 samples per device × 8 accumulation steps) to stay memory-efficient. Progress and metrics are sent to Weights & Biases (W&B). The task requests a GPU with at least 24 GB of VRAM, enabling smooth fine-tuning through dstack.

Note

This setup relies on the WANDB_API_KEY and WANDB_PROJECT environment variables to send training logs to Weights & Biases (W&B). To use W&B, create an account and get your API key for WANDB_API_KEY. WANDB_PROJECT can be any project label you want for sorting experiment runs.
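
Because the task lists WANDB_API_KEY and WANDB_PROJECT without values, dstack picks their values up from the shell you run the task from. A minimal sketch, assuming you export them before applying the configuration (the project name here is only an example):

console

$ export WANDB_API_KEY=<your-wandb-api-key>
$ export WANDB_PROJECT=deepseek-r1-distill-sft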

Apply the configuration

console

$ dstack apply -f .dstack.yaml

Provisioning can take several minutes depending on the machine type: VMs usually start in under 2 minutes, while bare metal systems may take about 30 minutes. After that, the fine-tuning job itself needs additional time to complete and save the model.
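
While the run is active, you can also follow it from the CLI. A quick sketch using dstack's run-management commands (trl-train is the run name set in the task definition above):

console

$ dstack ps
$ dstack logs trl-train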

Open the W&B dashboard using the URL shown in your terminal output to follow the fine-tuning progress.

W&B Dashboard

Conclusion

In this guide, you fine-tuned the DeepSeek R1 Distill Qwen 1.5B model with dstack on centron Cloud GPUs. You created a virtual environment, installed dstack, and set centron as your cloud provider, making the training workflow much simpler. With everything in place, you can now train and improve AI models efficiently at scale, supported by dependable, high-performance machine learning operations.

Source: vultr.com
