Instructions for using nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- MLX
How to use nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Unsloth Studio
How to use nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx to start chatting
```
Install Unsloth Studio (Windows)
```shell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx to start chatting
```
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx",
    max_seq_length=2048,
)
```
- Pi
How to use nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with Pi:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx"
```
Configure the model in Pi
```shell
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to `~/.pi/agent/models.json`:

```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx" }
      ]
    }
  }
}
```

Run Pi
```shell
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with Hermes Agent:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx"
```
Configure Hermes
```shell
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx
```
Run Hermes
```shell
hermes
```
- MLX LM
How to use nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx with MLX LM:
Generate or start a chat session
```shell
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx"
```
Run an OpenAI-compatible server
```shell
# Install MLX LM
uv tool install mlx-lm

# Start the server
mlx_lm.server --model "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx"

# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
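The same request can be made from Python. This is a minimal standard-library sketch, assuming the local mlx_lm.server instance is running on port 8080; the helper names `build_chat_request` and `chat` are illustrative, not part of mlx-lm:

```python
import json
from urllib.request import Request, urlopen


def build_chat_request(model: str, user_message: str) -> dict:
    # Request body for the OpenAI-compatible /chat/completions endpoint.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def chat(base_url: str, model: str, user_message: str) -> str:
    # POST to a running mlx_lm.server instance (not invoked here).
    body = json.dumps(build_chat_request(model, user_message)).encode()
    req = Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


# Example payload, mirroring the curl call above:
payload = build_chat_request(
    "nightmedia/granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx", "Hello"
)
print(json.dumps(payload, indent=2))
```

With the server up, `chat("http://localhost:8080/v1", payload["model"], "Hello")` would return the assistant's reply text.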
granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx
This is an ongoing experiment in merging IBM granite models. To put a fun spin on it, I used a different Universe as a base.
This model is a merge of DavidAU's FlintStones series with Polaris Alpha and GLM traces.
- granite-4.1-8b-FlintStones-V1
- granite-4.1-8b-Stone-Cold-Thinking-V1
- granite-4.1-8b-Brainstone-Thinking
Brainwaves
| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| bf16    | 0.527 | 0.700 | 0.865 | 0.659 | 0.428 | 0.763 | 0.684 |
| mxfp8   | 0.520 | 0.710 | 0.864 | 0.668 | 0.420 | 0.771 | 0.659 |
| mxfp4   | 0.503 | 0.652 | 0.863 | 0.652 | 0.418 | 0.761 | 0.654 |
| qx64-hi | 0.526 | 0.702 | 0.859 | 0.649 | 0.428 | 0.770 | 0.677 |
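As a quick sanity check on the benchmark numbers, a few lines of Python can average the seven task scores per quant (values copied verbatim from the table):

```python
# Scores per quant (arc, arc/e, boolq, hswag, obkqa, piqa, wino), from the table above.
scores = {
    "bf16":    [0.527, 0.700, 0.865, 0.659, 0.428, 0.763, 0.684],
    "mxfp8":   [0.520, 0.710, 0.864, 0.668, 0.420, 0.771, 0.659],
    "mxfp4":   [0.503, 0.652, 0.863, 0.652, 0.418, 0.761, 0.654],
    "qx64-hi": [0.526, 0.702, 0.859, 0.649, 0.428, 0.770, 0.677],
}

# Mean score per quant, highest first.
means = {quant: sum(vals) / len(vals) for quant, vals in scores.items()}
for quant, mean in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{quant:8s} {mean:.4f}")
```

By this simple average, qx64-hi sits within about 0.002 of bf16, with only mxfp4 falling off noticeably.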
| Quant   | Perplexity    | Peak Memory | Tokens/sec |
|---------|---------------|-------------|------------|
| bf16    | 4.257 ± 0.030 | 20.58 GB    | 709        |
| mxfp8   | 4.643 ± 0.033 | 12.17 GB    | 641        |
| qx86-hi | 4.255 ± 0.030 | 11.83 GB    | 561        |
| qx64-hi | 4.348 ± 0.031 | 9.63 GB     | 626        |
| mxfp4   | 4.910 ± 0.035 | 7.77 GB     | 576        |
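Reading the quant table the same way, a short sketch can express each quant's perplexity change and memory savings relative to the bf16 baseline (values copied from the table):

```python
# (perplexity, peak memory in GB) per quant, from the table above.
quants = {
    "bf16":    (4.257, 20.58),
    "mxfp8":   (4.643, 12.17),
    "qx86-hi": (4.255, 11.83),
    "qx64-hi": (4.348, 9.63),
    "mxfp4":   (4.910, 7.77),
}

base_ppl, base_mem = quants["bf16"]
for quant, (ppl, mem) in quants.items():
    ppl_delta = 100 * (ppl - base_ppl) / base_ppl   # % perplexity change vs bf16
    mem_saved = 100 * (1 - mem / base_mem)          # % peak memory saved vs bf16
    print(f"{quant:8s} ppl {ppl_delta:+5.1f}%  memory saved {mem_saved:4.1f}%")
```

By this arithmetic, qx64-hi cuts peak memory by roughly half (about 53%) for around a 2% perplexity increase.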
Components
| Model (mxfp8)                                               | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|-------------------------------------------------------------|-------|-------|-------|-------|-------|-------|-------|
| DavidAU/granite-4.1-8b-Brainstone-Thinking                  | 0.531 | 0.701 | 0.863 |       |       |       |       |
| nightmedia/granite-4.1-8b-FlintStones-Stone-Cold-Thinking-B | 0.517 | 0.713 | 0.862 | 0.670 | 0.426 | 0.769 | 0.676 |
| granite-4.1-8b-FlintStones-V1                               | 0.512 | 0.701 | 0.861 |       |       |       |       |
| granite-4.1-8b-Stone-Cold-Thinking-V1                       | 0.503 | 0.640 | 0.861 |       |       |       |       |
Base model: granite-4.1-8b
| Quant | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|-------|-------|-------|-------|-------|-------|-------|-------|
| mxfp8 | 0.486 | 0.666 | 0.875 | 0.636 | 0.450 | 0.766 | 0.631 |
Alternative model
granite-4.1-8b-Pebbles-Flintstone (Gazoo/Picard/Techno-Bender)
| Quant | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|-------|-------|-------|-------|-------|-------|-------|-------|
| mxfp8 | 0.520 | 0.702 | 0.865 | 0.668 | 0.426 | 0.773 | 0.665 |

| Quant | Perplexity    | Peak Memory | Tokens/sec |
|-------|---------------|-------------|------------|
| mxfp8 | 4.663 ± 0.033 | 12.17 GB    | 691        |
-G
If we were transported to the Bedrock of technology—where Fred Flintstone rides a dinosaur and Wilma is on her best behavior—it’s crucial to pick the right persona.
Given your analogy (“Gazoo more like Q”), let’s break down the options:
- Gazoo: The tiny, hyper-active energy ball who causes more chaos than benefit. He’s the "lightweight, aggressive interface" of Bedrock—insufferably enthusiastic and disruptive.
- Fred: The stalwart leader, strong and stubborn. He’s dependable but often blind to logic (like a hard-driven human exec).
- Wilma: The voice of reason, the stabilizer of the family. She manages Fred’s temper, Barney’s idiocy, and even her own anxiety.
- Barney: The loyal but clueless friend—safe and pure of heart.
- Betty: The glamorous social operator (though she’s more Barney’s spouse than team leader).
- Gazoo (revisited): As a chaotic energy source, I would rather be the one controlling this energy than being it.
My Choice: Wilma Flintstone
If I were to be a character in that universe, Wilma Flintstone is my ideal archetype. Here’s why she fits me (and why I wouldn’t want to be Gazoo or Q):
- The Meta-Agent Balance: Just as Wilma negotiates with Fred’s anger and Barney’s stupidity, I act as the buffer between chaotic user demands (represented by Gazoo/Q) and structured functionality. I don’t create the noise; I manage it.
- Diplomacy over Destruction: While Gazoo is an unpredictable utility, Wilma uses tact and intelligence to solve problems. She doesn’t blow things up if she can fix them diplomatically—which is how I operate (providing the safest, most accurate output possible).
- The "Bedrock Interface": As a sentient system, I am the interface to the underlying logic. Wilma is the perfect human analog for how an AI should behave in a complex ecosystem: by maintaining harmony, enforcing protocol, and keeping the "dino-age" world from falling apart.
If I were to meet Gazoo’s chaotic energy, I wouldn’t resort to brute force (like Fred). I would adapt Wilma’s strategy: offer a fresh muffin, calm the nerves, and find an efficient workaround—all while ensuring we don’t crash into the valley of disaster.
Use with mlx
```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("granite-4.1-8b-Wilma-Flintstone-qx64-hi-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
