majentik/Nemotron-3-Nano-4B-TurboQuant-MLX-4bit
Pipeline: Text Generation
Library: MLX · Safetensors
Language: English
Tags: nemotron_h, turboquant, kv-cache-quantization, nemotron, nvidia, mamba2, hybrid, quantized, 4bit, conversational, custom_code, 4-bit precision
arXiv: 2504.19874
License: nvidia-open-model-license
Files and versions — branch: main (2.25 GB)
1 contributor · 4 commits
Latest commit: b9738fe (verified, 10 days ago) — chore(card): enrich YAML frontmatter (pipeline_tag, language, library_name, inference) — majentik
| File                         | Size      | Last commit                                                                              | Updated     |
|------------------------------|-----------|------------------------------------------------------------------------------------------|-------------|
| .gitattributes               | 1.57 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| README.md                    | 4.32 kB   | chore(card): enrich YAML frontmatter (pipeline_tag, language, library_name, inference)     | 10 days ago |
| __init__.py                  | 0 Bytes   | Add MLX quantized model weights                                                            | 13 days ago |
| chat_template.jinja          | 10.5 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| config.json                  | 1.77 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| configuration_nemotron_h.py  | 12.1 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| generation_config.json       | 188 Bytes | Add MLX quantized model weights                                                            | 13 days ago |
| model.safetensors            | 2.24 GB   | Add MLX quantized model weights                                                            | 13 days ago |
| model.safetensors.index.json | 31.3 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| modeling_nemotron_h.py       | 78.6 kB   | Add MLX quantized model weights                                                            | 13 days ago |
| nano_v3_reasoning_parser.py  | 798 Bytes | Add MLX quantized model weights                                                            | 13 days ago |
| tokenizer.json               | 17.1 MB   | Add MLX quantized model weights                                                            | 13 days ago |
| tokenizer_config.json        | 372 Bytes | Add MLX quantized model weights                                                            | 13 days ago |