patrickcmd committed on
Commit 54680fd · verified · 1 Parent(s): 6f89a1f

Update README.md

Files changed (1)
  1. README.md +15 -15
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  license: apache-2.0
- base_model: Sunbird/qwen3-14b-sunflower-merged
+ base_model: Sunbird/Sunflower-14B
  tags:
  - quantized
  - gguf
@@ -22,7 +22,7 @@ GGUF quantized versions of the Sunflower model for Ugandan language translation

  ## Model Details

- - **Base Model**: [Sunbird/qwen3-14b-sunflower-merged](https://huggingface.co/Sunbird/qwen3-14b-sunflower-merged)
+ - **Base Model**: [Sunbird/Sunflower-14B](https://huggingface.co/Sunbird/Sunflower-14B)
  - **Model Size**: 14B parameters
  - **Architecture**: Qwen2.5
  - **Quantization**: K-means quantization with importance matrix
@@ -34,12 +34,12 @@ GGUF quantized versions of the Sunflower model for Ugandan language translation

  | Filename | Quant type | File Size | Description |
  | -------- | ---------- | --------- | ----------- |
- | sunflower-merged-f16.gguf | F16 | 28GB | Original precision |
- | sunflower-q8_0.gguf | Q8_0 | 15GB | Highest quality quantized |
- | sunflower-q6_k.gguf | Q6_K | 12GB | High quality |
- | sunflower-q5_k_m.gguf | Q5_K_M | 9.8GB | Balanced quality/size |
- | sunflower-q5_k_s.gguf | Q5_K_S | 9.6GB | Smaller Q5 variant |
- | sunflower-q4_k_m.gguf | Q4_K_M | 8.4GB | **Recommended for most users** |
+ | sunflower-14B-f16.gguf | F16 | 28GB | Original precision |
+ | sunflower-14B-q8_0.gguf | Q8_0 | 15GB | Highest quality quantized |
+ | sunflower-14B-q6_k.gguf | Q6_K | 12GB | High quality |
+ | sunflower-14B-q5_k_m.gguf | Q5_K_M | 9.8GB | Balanced quality/size |
+ | sunflower-14B-q5_k_s.gguf | Q5_K_S | 9.6GB | Smaller Q5 variant |
+ | sunflower-14B-q4_k_m.gguf | Q4_K_M | 8.4GB | **Recommended for most users** |

  ### Warning: Experimental Quantizations

@@ -47,9 +47,9 @@ The following quantizations achieve extreme compression but may significantly im

  | Filename | Quant type | File Size | Compression | Warning |
  | -------- | ---------- | --------- | ----------- | ------- |
- | sunflower-iq2_xxs.gguf | IQ2_XXS | 4.1GB | 85% smaller | May lose translation accuracy |
- | sunflower-tq1_0.gguf | TQ1_0 | 3.7GB | 87% smaller | Experimental ternary quantization |
- | sunflower-iq1_s.gguf | IQ1_S | 3.4GB | 88% smaller | **Extreme compression, quality heavily impacted** |
+ | sunflower-14B-iq2_xxs.gguf | IQ2_XXS | 4.1GB | 85% smaller | May lose translation accuracy |
+ | sunflower-14B-tq1_0.gguf | TQ1_0 | 3.7GB | 87% smaller | Experimental ternary quantization |
+ | sunflower-14B-iq1_s.gguf | IQ1_S | 3.4GB | 88% smaller | **Extreme compression, quality heavily impacted** |

  **Note**: The experimental quantizations (IQ1_S, IQ2_XXS, TQ1_0) use advanced compression techniques that may not preserve the specialized knowledge for Ugandan language translation. Test thoroughly before production use.

@@ -65,10 +65,10 @@ The following quantizations achieve extreme compression but may significantly im

  ```bash
  # Download model
- huggingface-cli download Sunbird/Sunflower-14B-GGUF sunflower-q4_k_m.gguf --local-dir .
+ huggingface-cli download Sunbird/Sunflower-14B-GGUF sunflower-14B-q4_k_m.gguf --local-dir .

  # Run inference
- ./llama-cli -m sunflower-q4_k_m.gguf -p "Translate to Luganda: Hello, how are you today?"
+ ./llama-cli -m sunflower-14B-q4_k_m.gguf -p "Translate to Luganda: Hello, how are you today?"
  ```

  ## Ollama Integration
@@ -208,7 +208,7 @@ curl http://localhost:11434/api/version
  ```python
  from llama_cpp import Llama

- llm = Llama(model_path="sunflower-q4_k_m.gguf")
+ llm = Llama(model_path="sunflower-14B-q4_k_m.gguf")
  result = llm("Translate to Luganda: How are you?")
  print(result['choices'][0]['text'])
  ```
@@ -226,4 +226,4 @@ Quantized using llama.cpp with importance matrix calibration for optimal quality

  ## License

- Apache 2.0
+ Apache 2.0
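
Since this commit renames every GGUF file, any local Ollama setup built from the old filenames needs updating too. A minimal sketch of re-importing the renamed Q4_K_M file (the README's Ollama Integration section is not shown in these hunks, so the model name `sunflower-14b` here is a hypothetical choice, and the file is assumed to already be downloaded to the current directory):

```shell
# Write a minimal Modelfile pointing at the renamed quant.
cat > Modelfile <<'EOF'
FROM ./sunflower-14B-q4_k_m.gguf
EOF

# Re-create and test the model, but only if ollama is installed here.
if command -v ollama >/dev/null 2>&1; then
  ollama create sunflower-14b -f Modelfile
  ollama run sunflower-14b "Translate to Luganda: Good morning"
fi
```

Existing scripts that reference the old `sunflower-q4_k_m.gguf` path will otherwise fail with a file-not-found error after pulling the renamed files.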