Update README.md

---
license: apache-2.0
base_model: Sunbird/Sunflower-14B
tags:
- quantized
- gguf
---

GGUF quantized versions of the Sunflower model for Ugandan language translation.

## Model Details

- **Base Model**: [Sunbird/Sunflower-14B](https://huggingface.co/Sunbird/Sunflower-14B)
- **Model Size**: 14B parameters
- **Architecture**: Qwen2.5
- **Quantization**: K-means quantization with importance matrix
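The importance-matrix workflow described above can be reproduced with llama.cpp's own tools. A rough sketch, assuming a llama.cpp build where the binaries are named `llama-imatrix` and `llama-quantize` (`calibration.txt` is a placeholder for a representative text sample, not a file shipped with this repo):

```bash
# Build an importance matrix from a calibration text file
./llama-imatrix -m sunflower-14B-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize the F16 model to Q4_K_M, weighting tensors by the importance matrix
./llama-quantize --imatrix imatrix.dat sunflower-14B-f16.gguf sunflower-14B-q4_k_m.gguf Q4_K_M
```

The same pattern applies to the other quantization types listed below; only the final type argument changes.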

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| sunflower-14B-f16.gguf | F16 | 28GB | Original precision |
| sunflower-14B-q8_0.gguf | Q8_0 | 15GB | Highest quality quantized |
| sunflower-14B-q6_k.gguf | Q6_K | 12GB | High quality |
| sunflower-14B-q5_k_m.gguf | Q5_K_M | 9.8GB | Balanced quality/size |
| sunflower-14B-q5_k_s.gguf | Q5_K_S | 9.6GB | Smaller Q5 variant |
| sunflower-14B-q4_k_m.gguf | Q4_K_M | 8.4GB | **Recommended for most users** |
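As a rule of thumb, pick the largest file that fits your memory budget. A small helper sketch using the sizes from the table above (the thresholds are illustrative, not official guidance):

```python
# Approximate file sizes in GB, copied from the table above,
# ordered from highest quality (largest) to smallest.
QUANTS = [
    ("sunflower-14B-q8_0.gguf", 15.0),
    ("sunflower-14B-q6_k.gguf", 12.0),
    ("sunflower-14B-q5_k_m.gguf", 9.8),
    ("sunflower-14B-q5_k_s.gguf", 9.6),
    ("sunflower-14B-q4_k_m.gguf", 8.4),
]

def pick_quant(budget_gb: float) -> str:
    """Return the largest (highest-quality) file that fits the budget."""
    for name, size in QUANTS:
        if size <= budget_gb:
            return name
    raise ValueError(f"No quantization fits in {budget_gb} GB")

print(pick_quant(10))  # -> sunflower-14B-q5_k_m.gguf (9.8 GB fits)
```

Remember to leave headroom beyond the file size itself for the KV cache and runtime overhead.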
### Warning: Experimental Quantizations

The following quantizations achieve extreme compression but may significantly impact translation quality:

| Filename | Quant type | File Size | Compression | Warning |
| -------- | ---------- | --------- | ----------- | ------- |
| sunflower-14B-iq2_xxs.gguf | IQ2_XXS | 4.1GB | 85% smaller | May lose translation accuracy |
| sunflower-14B-tq1_0.gguf | TQ1_0 | 3.7GB | 87% smaller | Experimental ternary quantization |
| sunflower-14B-iq1_s.gguf | IQ1_S | 3.4GB | 88% smaller | **Extreme compression, quality heavily impacted** |

**Note**: The experimental quantizations (IQ1_S, IQ2_XXS, TQ1_0) use advanced compression techniques that may not preserve the specialized knowledge for Ugandan language translation. Test thoroughly before production use.
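The "Compression" column is measured against the 28GB F16 file. A quick sanity check of those percentages, using the sizes from the tables above:

```python
F16_GB = 28.0  # size of the original-precision file

# Experimental quant sizes in GB, from the table above
experimental = {"IQ2_XXS": 4.1, "TQ1_0": 3.7, "IQ1_S": 3.4}

for quant, size_gb in experimental.items():
    pct_smaller = round((1 - size_gb / F16_GB) * 100)
    print(f"{quant}: {pct_smaller}% smaller")  # 85, 87, 88
```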

```bash
# Download model
huggingface-cli download Sunbird/Sunflower-14B-GGUF sunflower-14B-q4_k_m.gguf --local-dir .

# Run inference
./llama-cli -m sunflower-14B-q4_k_m.gguf -p "Translate to Luganda: Hello, how are you today?"
```

## Ollama Integration

To verify the Ollama server is reachable: `curl http://localhost:11434/api/version`
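One plausible way to load the GGUF into Ollama is via a Modelfile. A minimal sketch, assuming you have already downloaded `sunflower-14B-q4_k_m.gguf` into the current directory (the model name `sunflower` is illustrative):

```bash
# Write a minimal Modelfile that points at the local GGUF file
cat > Modelfile <<'EOF'
FROM ./sunflower-14B-q4_k_m.gguf
EOF

# Register the model with Ollama under the name "sunflower"
ollama create sunflower -f Modelfile

# Run a translation prompt
ollama run sunflower "Translate to Luganda: Hello, how are you today?"
```

A production Modelfile would typically also set a `TEMPLATE` and `PARAMETER` entries matching the base model's chat format; those are omitted here because they are not specified in this card.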

```python
from llama_cpp import Llama

llm = Llama(model_path="sunflower-14B-q4_k_m.gguf")
result = llm("Translate to Luganda: How are you?")
print(result['choices'][0]['text'])
```

Quantized using llama.cpp with importance matrix calibration for optimal quality.

## License

Apache 2.0