Update query_clarification gguf
#29
by kndtran - opened
No description provided.
The original query_clarification/granite4_micro/lora/Lora-q8_0.gguf was only 288 bytes: it contained the GGUF header but no tensor data. The cause is a bug in llama.cpp's convert_lora_to_gguf.py, where LoraTorchTensor lacks a split() method. The Granite model handler calls split() when processing the fused FFN weights (shared_mlp.input_linear.weight), which the other intrinsics don't train LoRA adapters on.
Regenerated from the source safetensors with a patched converter. The new file is 66 MB with 560 tensors (vs 14-15 MB for the other intrinsics, because query_clarification trains on additional FFN layers). Tested and confirmed working.
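For context, the gist of the converter patch can be sketched as follows. This is a minimal numpy illustration of why split() is well-defined for a LoRA factorization, not the upstream llama.cpp code: the class name, method signature, and shapes here are assumptions. The key observation is that a LoRA weight W = B @ A can be split along its output dimension by splitting only B row-wise, sharing A across the chunks, which is exactly what separating a fused gate/up projection requires.

```python
import numpy as np

class LoraTensor:
    """Toy stand-in for the converter's LoraTorchTensor wrapper.

    Represents a LoRA weight as the low-rank product W = B @ A.
    Names and semantics are illustrative, not the upstream API.
    """
    def __init__(self, A, B):
        self.A = A  # (rank, in_features), shared across output rows
        self.B = B  # (out_features, rank)

    def dense(self):
        # Materialize the full-rank weight delta.
        return self.B @ self.A

    def split(self, sizes, dim=0):
        # Splitting along the output dimension (dim 0 of W) only needs
        # B split row-wise; every chunk reuses the same A. Splits along
        # the input dimension are not needed for a fused FFN projection.
        if dim != 0:
            raise NotImplementedError("only output-dim splits are sketched")
        chunks, start = [], 0
        for s in sizes:
            chunks.append(LoraTensor(self.A, self.B[start:start + s]))
            start += s
        return chunks

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 16))   # rank 4, in_features 16
B = rng.standard_normal((24, 4))   # fused out_features 24 = 12 + 12
fused = LoraTensor(A, B)
gate, up = fused.split([12, 12], dim=0)
# Stacking the dense halves reproduces the fused weight exactly.
assert np.allclose(np.vstack([gate.dense(), up.dense()]), fused.dense())
```

Because the split is exact, regenerating the GGUF with a converter that implements it recovers all of the FFN adapter tensors that the original 288-byte file was silently dropping.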
kndtran changed pull request status to open
frreiss changed pull request status to merged