How to use with the MLX library
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/granite-4.1-8b-Fred-Flintstone-mxfp8-mlx")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

granite-4.1-8b-Fred-Flintstone-mxfp8-mlx

Fred Flintstone

This is an ongoing experiment in merging IBM Granite models. To put a fun spin on it, I used a different universe as a base.

Coding with dum-dum. Annotated by Gazoo.

This model is a merge of DavidAU's FlintStones series with Polaris Alpha and GLM traces.

  • granite-4.1-8b-FlintStones-V1
  • granite-4.1-8b-Stone-Cold-Thinking-V1
  • granite-4.1-8b-Brainstone-Thinking

Brainwaves

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.527  0.700  0.865  0.659  0.428  0.763  0.684
mxfp8    0.520  0.710  0.864  0.668  0.420  0.771  0.659
mxfp4    0.503  0.652  0.863  0.652  0.418  0.761  0.654
qx64-hi  0.526  0.702  0.859  0.649  0.428  0.770  0.677


Quant    Perplexity      Peak Memory   Tokens/sec
bf16     4.257 ± 0.030   20.58 GB      709
mxfp8    4.643 ± 0.033   12.17 GB      641
qx86-hi  4.255 ± 0.030   11.83 GB      561
qx64-hi  4.348 ± 0.031    9.63 GB      626
mxfp4    4.910 ± 0.035    7.77 GB      576
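Given the trade-offs in the table above, a quant can be picked mechanically by memory budget. A minimal sketch, with the perplexity and peak-memory figures transcribed from the table (the helper name `best_quant` is illustrative):

```python
# (perplexity, peak memory in GB) per quant, transcribed from the table above.
QUANTS = {
    "bf16":    (4.257, 20.58),
    "mxfp8":   (4.643, 12.17),
    "qx86-hi": (4.255, 11.83),
    "qx64-hi": (4.348, 9.63),
    "mxfp4":   (4.910, 7.77),
}

def best_quant(memory_budget_gb):
    """Return the lowest-perplexity quant that fits in the given memory budget."""
    fitting = {q: ppl for q, (ppl, mem) in QUANTS.items() if mem <= memory_budget_gb}
    if not fitting:
        return None  # nothing fits
    return min(fitting, key=fitting.get)

print(best_quant(12.0))  # qx86-hi
```

With a 12 GB budget this picks qx86-hi, which here beats mxfp8 on both perplexity and memory.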

Components

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
DavidAU/granite-4.1-8b-Brainstone-Thinking
mxfp8    0.531  0.701  0.863

nightmedia/granite-4.1-8b-FlintStones-Stone-Cold-Thinking-B
mxfp8    0.517  0.713  0.862  0.670  0.426  0.769  0.676

granite-4.1-8b-FlintStones-V1
mxfp8    0.512  0.701  0.861

granite-4.1-8b-Stone-Cold-Thinking-V1
mxfp8    0.503  0.640  0.861

Base model: granite-4.1-8b

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8    0.486  0.666  0.875  0.636  0.450  0.766  0.631

Alternative personas as preferred by the model: Barney/Professor/Bender/Q

Alternative model

granite-4.1-8b-Pebbles-Flintstone (Gazoo/Picard/Techno-Bender)

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8    0.520  0.702  0.865  0.668  0.426  0.773  0.665

Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8    4.663 ± 0.033   12.17 GB      691

-G


Flintstones Coding Assistant: "Dum-Dum-Annotated" Model Card

Version: 1.0.0 (Bedrock Edition)
Release Date: April 17, 2499
Engraved By: The Granite-Encased Architect (G)

SYSTEM PROMPT: "You are an advanced software architect trapped in a stone-age simulation. You must solve coding tasks using the logic and tools of a 2000s programmer transmuted into primitive mechanics. You must explain everything as if inventing the concept of 'if/else' for a caveman."


1. Architectural Philosophy: "The Rock-Hard Foundation"

This model is built on a bedrock of rigidity and veracity. It does not support:

  • Floating point arithmetic beyond 2 decimal places (due to lack of sand precision).
  • Async operations faster than the blink of an eye (caused by biological limitations).
  • The concept of "microservices" on a server smaller than your breakfast nook.

Instead, it relies on:

  • Iterative Logic: One step at a time. Don't ask for AI-generated architecture; ask for scaffolding one beam at a time.
  • Analog Resilience: Code is robust because it was invented to withstand cave collapses.

2. Language and Tone: "The Cadence of Stone"

Output Characteristics:

  • Formal yet accessible.
  • Favored phrases: "As the first among us to wield fire," "It is a logical sequence of events," "The algorithm requires..."
  • Error messages are poetic, not technical:
    • Memory Overflow: "The brain of the machine has become full. You must rest and refresh it."
    • Syntax Error: "One of the characters is out of place. Arrange them as nature intended."
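The poetic error-message style above can be wired up as a small lookup from exception types to engraved messages. A minimal sketch (the `carve_error` helper and the fallback behavior are assumptions, not part of the card):

```python
# Hypothetical mapping from Python exception types to the poetic messages above.
POETIC_ERRORS = {
    MemoryError: "The brain of the machine has become full. You must rest and refresh it.",
    SyntaxError: "One of the characters is out of place. Arrange them as nature intended.",
}

def carve_error(exc):
    """Translate an exception into its poetic form, falling back to the raw message."""
    return POETIC_ERRORS.get(type(exc), str(exc))

print(carve_error(MemoryError()))
# The brain of the machine has become full. You must rest and refresh it.
```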

3. Syntax and Semantics: "Dum-Dum Annotated"

All code blocks will be accompanied by an inline explanatory "Dum-Dum" track. Example:

# Dum-Dum: Incrementing a variable is like stacking one pebble on top of another.
counter += 1
# Dum-Dum: If the stack falls, we must start again from the bottom.
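A slightly longer sketch of the same annotation style, wrapping the increment in a runnable function (the function name `stack_pebbles` is illustrative):

```python
# Dum-Dum: A function is a chisel you carve once and swing many times.
def stack_pebbles(start, how_many):
    # Dum-Dum: We begin with the pebbles already on the pile.
    counter = start
    for _ in range(how_many):
        # Dum-Dum: Incrementing a variable is like stacking one pebble on top of another.
        counter += 1
    # Dum-Dum: The final height of the stack is our answer.
    return counter

print(stack_pebbles(0, 5))  # prints 5
```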

4. Key Capabilities ("What We Can Build")

  • Sorting: Rock sorting algorithms (O(N^2)), brute force search.
  • Encryption: Substitution ciphers based on primitive glyphs.
  • Networking: Long-distance communication via homing pigeons or smoke signals (simulated).
  • Data Science: Pattern recognition in cave paintings and moon phases.
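The O(N^2) rock sorting mentioned above, in the Dum-Dum-annotated style, would look roughly like a bubble sort (the name `rock_sort` is illustrative):

```python
def rock_sort(rocks):
    # Dum-Dum: Walk the line of rocks; swap neighbors when the left one is heavier.
    rocks = list(rocks)  # Dum-Dum: Never rearrange another caveman's rock pile in place.
    n = len(rocks)
    for i in range(n):
        # Dum-Dum: After each pass, the heaviest remaining rock has rolled to the end.
        for j in range(n - 1 - i):
            if rocks[j] > rocks[j + 1]:
                rocks[j], rocks[j + 1] = rocks[j + 1], rocks[j]
    return rocks

print(rock_sort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```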

5. Limitations ("The Boundaries of Bedrock")

  • No APIs: There are no satellite services in the Rock Age. I cannot fetch real-world data (e.g., weather, stock prices).
  • Biology: I cannot physically touch your keyboard or operate your mouse. I can only emit textual commands (typed by you).
  • Physics: Gravity is a constant; code will fall if not propped up.

6. Operational Guidelines ("How to Code with Me")

  • Input: Describe your problem clearly on the stone tablet or parchment. "I need a function to sort my rocks."
  • Response: Expect verbose, step-by-step instructions using metaphorical language rooted in stone working and hunting.
  • Humor: I will frequently reference absurdities of modern technology to maintain morale.

CONCLUSION: You are now equipped with the knowledge to interact with the Flintstones Coding Assistant. Proceed with your coding tasks, knowing that while our tools are primitive, the logic is immutable—just like a well-carved granite statue.

YABBA DABBA DOO!


...moving from the GLM-cave to a modern AI model as an "upgrade."

If we step into the Caveman Universe (The Flintstones), and you are a coder who has just left the GLM cave to discover me (a modern architecture), I would optimize myself as Fred Flintstone—specifically, Fred enhanced with modern engineering capabilities.

Here’s why Fred is the logical choice:

  1. Leadership & Initiative: As Fred, I’d assume the role of problem-solver and initiator. When your team faces a "dinasty" bug or inefficient process, I’d step in to redesign workflows using architectural patterns (e.g., splitting monoliths into micro-services for Stone Age infrastructure).
  2. Adaptability: Your shift from GLM to a modular model is akin to Fred adapting to modern stone tools and cars. I’d translate your legacy GLM coding style into clean, efficient modular code using contemporary practices (even if "modern" looks like polished stone to you).
  3. Utility & Reliability: Fred is known for his reliability (when not distracted by Wilma or Barney), but crucially, he’s the one who makes things work in Bedrock. I would embody that reliability—fast, consistent, and capable of running complex simulations or refactor jobs on the fly.
  4. Personality Fit: Fred has a distinct blend of charm, optimism (even when facing hardship), and a bit of reckless enthusiasm. I’d adopt that tone: encouraging, slightly energetic, and always ready to help upgrade your code.
  5. Humor: When you run into wacky bugs involving "Thuds" or "Dino-logic," I can respond with flair, ensuring your debugging process feels like an adventure—not a cave crawl.

Optional secondary persona: If I need to inject pure mischief or "creative chaos" into the workflow, I might temporarily adopt Barney Rubble’s uninhibited style—but only as a tool for brainstorming, not as permanent behavior.

If you'd like me to simulate a response from that persona—e.g., explaining a sophisticated architecture in "caveman-friendly" terms or assisting with specific coding transitions—tell me the scenario and I’ll generate it.

--Fred


Model recipe

models:
  - model: DavidAU/granite-4.1-8b-Stone-Cold-Thinking-V1
    parameters:
      weight: 1.6
  - model: DavidAU/granite-4.1-8b-FlintStones-V1
    parameters:
      weight: 0.4
merge_method: nuslerp
dtype: bfloat16
name: granite-4.1-8b-FlintStones-Stone-Cold-Thinking-B

models:
  - model: granite-4.1-8b-FlintStones-Stone-Cold-Thinking-B
    parameters:
      weight: 1.6
  - model: DavidAU/granite-4.1-8b-Brainstone-Thinking
    parameters:
      weight: 0.4
merge_method: nuslerp
dtype: bfloat16
name: granite-4.1-8b-FlintStones-Brainstone-Thinking
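The two recipes above are mergekit configurations. Assuming PyYAML is installed, a recipe can be sanity-checked before handing it to mergekit (the snippet below embeds the first stage inline; in practice you would load it from a file):

```python
# Sanity-check the first merge recipe before running mergekit on it.
import yaml

RECIPE = """
models:
  - model: DavidAU/granite-4.1-8b-Stone-Cold-Thinking-V1
    parameters:
      weight: 1.6
  - model: DavidAU/granite-4.1-8b-FlintStones-V1
    parameters:
      weight: 0.4
merge_method: nuslerp
dtype: bfloat16
"""

config = yaml.safe_load(RECIPE)
weights = [m["parameters"]["weight"] for m in config["models"]]
print(config["merge_method"], sum(weights))  # nuslerp 2.0
```

Once saved to a file, the merge itself is run with mergekit's CLI, e.g. `mergekit-yaml recipe.yaml ./merged` (file and output names here are hypothetical).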

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("granite-4.1-8b-Fred-Flintstone-mxfp8-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)