Post-merge workflow

After `qlora-merge` completes, you have a standalone fp16 model directory. The workflow below converts that directory to GGUF, quantizes it, and publishes the resulting files.

Prerequisites

Create a GGUF base file from the merged weights.

Step	Tool	What it produces
Convert to GGUF	`llama.cpp/convert_hf_to_gguf.py`	`model-F16.gguf`

Success criteria: `model-F16.gguf` exists.
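
The conversion step can be sketched in Python as follows. This is a minimal sketch, not the project's own tooling: it assumes a `llama.cpp` checkout in the working directory and uses the converter's `--outfile` and `--outtype` options. The function names and directory arguments are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(merged_dir: Path, out_file: Path,
                          llama_cpp_dir: Path = Path("llama.cpp")) -> list[str]:
    """Build the argv for llama.cpp's HF-to-GGUF converter with fp16 output."""
    return [
        sys.executable,
        str(llama_cpp_dir / "convert_hf_to_gguf.py"),
        str(merged_dir),                 # merged fp16 model directory
        "--outfile", str(out_file),
        "--outtype", "f16",
    ]


def convert_to_gguf(merged_dir: Path, out_dir: Path) -> Path:
    """Run the converter and enforce the success criterion for this step."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "model-F16.gguf"
    subprocess.run(build_convert_command(merged_dir, out_file), check=True)
    if not out_file.is_file():
        raise FileNotFoundError(f"expected {out_file} after conversion")
    return out_file
```

Calling `convert_to_gguf(Path("merged"), Path("gguf"))` would write `gguf/model-F16.gguf`; both paths are placeholders. Building the command separately from running it makes it easy to print and inspect before committing to a long conversion.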

Generate quantized files suited to inference.

Step	Tool	What it produces
Quantize	`llama-quantize`	`Q4_K_M` · `Q5_K_M` · `Q6_K` · `Q8_0`

Success criteria: at least one quantized GGUF file (commonly `Q4_K_M`) is created.
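
A sketch of the quantization loop, assuming `llama-quantize` is on `PATH` and is invoked as `llama-quantize <input> <output> <type>`. The output naming (`model-<TYPE>.gguf`) is an assumption that mirrors `model-F16.gguf`, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path

# Quantization types listed in the table above.
QUANT_TYPES = ("Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0")


def quantized_path(f16_file: Path, quant_type: str) -> Path:
    """Assumed naming scheme: model-<TYPE>.gguf next to the F16 file."""
    return f16_file.with_name(f"model-{quant_type}.gguf")


def build_quantize_command(f16_file: Path, quant_type: str,
                           binary: str = "llama-quantize") -> list[str]:
    """Build the argv for one llama-quantize run."""
    if quant_type not in QUANT_TYPES:
        raise ValueError(f"unsupported quant type for this workflow: {quant_type}")
    return [binary, str(f16_file),
            str(quantized_path(f16_file, quant_type)), quant_type]


def quantize_all(f16_file: Path, quant_types=QUANT_TYPES) -> list[Path]:
    """Quantize to each type and enforce the success criterion for this step."""
    outputs = []
    for quant_type in quant_types:
        subprocess.run(build_quantize_command(f16_file, quant_type), check=True)
        out_file = quantized_path(f16_file, quant_type)
        if out_file.is_file():
            outputs.append(out_file)
    if not outputs:
        raise RuntimeError("no quantized GGUF file was created")
    return outputs
```

Passing a shorter tuple such as `("Q4_K_M",)` to `quantize_all` produces only the most commonly used variant, which already satisfies the success criterion.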

Publish the GGUF output directory.

Step	Tool	What it produces
Upload	`qwen35-upload`	Hugging Face Hub repo

Success criteria: the Hub repo contains the expected GGUF files.
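
The final check can be automated after `qwen35-upload` finishes. The sketch below does not call `qwen35-upload` itself, since its options are not described here. It assumes the `huggingface_hub` package is installed and uses `HfApi.list_repo_files` to read the repo contents. The repo id and expected file names are hypothetical placeholders.

```python
def missing_gguf_files(repo_files: list[str], expected: list[str]) -> list[str]:
    """Return the expected GGUF file names that are absent from the repo listing."""
    present = set(repo_files)
    return [name for name in expected if name not in present]


def list_hub_files(repo_id: str) -> list[str]:
    """List files in a Hub model repo (requires huggingface_hub and network access)."""
    from huggingface_hub import HfApi  # imported lazily so the pure check has no dependency
    return HfApi().list_repo_files(repo_id)


def verify_upload(repo_id: str, expected: list[str]) -> None:
    """Raise if any expected GGUF file is missing from the Hub repo."""
    missing = missing_gguf_files(list_hub_files(repo_id), expected)
    if missing:
        raise RuntimeError(f"{repo_id} is missing: {', '.join(missing)}")
```

For example, `verify_upload("your-name/your-model-GGUF", ["model-F16.gguf", "model-Q4_K_M.gguf"])` raises if either file is absent. Private repos would additionally need an access token configured for `huggingface_hub`.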