Instructions for using nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with libraries and local apps.

Libraries

Transformers

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx")
model = AutoModelForCausalLM.from_pretrained("nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

MLX

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

vLLM

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx

SGLang

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Pi

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx

Run Hermes

hermes

MLX LM

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
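
The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server is reachable at http://localhost:8080/v1 (the address used in the Pi and Hermes settings on this page) and that the response follows the usual OpenAI chat-completions shape; the helper names build_chat_request and chat are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumption: the local server listens on this address (same as the Pi/Hermes config).
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=120):
    """Send the prompt to the running server and return the reply text."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response with the reply in the first choice.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Building the request separately from sending it makes it easy to inspect the payload without a live server.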

Docker Model Runner

How to use nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx with Docker Model Runner:
```
docker model run hf.co/nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx
```

Qwen3-4B-Element8-Eva-qx86-hi-mlx

This is a model merge between Element8 and FutureMa/Eva-4B.

Brainwaves of qx86-hi quants of the parent models and Qwen3-4B-Element8-Eva

Element8     0.540,0.725,0.866,0.708,0.430,0.769,0.669
Eva-4B       0.539,0.747,0.864,0.606,0.412,0.751,0.605

Qwen3-4B-Element8-Eva
bf16         0.561,0.769,0.873,0.692,0.420,0.766,0.651
qx86-hi      0.559,0.768,0.872,0.694,0.422,0.765,0.647

I also made a Qwen3-4B-Element4-Eva, just because it sounds cool :)

Element4-Eva 0.567,0.781,0.868,0.689,0.426,0.773,0.642
Element4     0.582,0.779,0.849,0.708,0.442,0.771,0.655

The qx86-hi quant performs at the same level as full precision (bf16) in this model.
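
To put a number on that comparison, here is a small sketch that takes the bf16 and qx86-hi rows for Qwen3-4B-Element8-Eva from the table above and computes the per-column difference. The columns are treated as unnamed positions, since the table does not label them; the variable names are only for illustration.

```python
# Scores copied from the Qwen3-4B-Element8-Eva rows above, in table order.
bf16 = [0.561, 0.769, 0.873, 0.692, 0.420, 0.766, 0.651]
qx86_hi = [0.559, 0.768, 0.872, 0.694, 0.422, 0.765, 0.647]

# Per-column difference (quant minus full precision), rounded to table precision.
deltas = [round(q - b, 3) for q, b in zip(qx86_hi, bf16)]
max_abs_delta = max(abs(d) for d in deltas)

# Unweighted mean across the seven columns.
mean_bf16 = sum(bf16) / len(bf16)
mean_qx86_hi = sum(qx86_hi) / len(qx86_hi)

print("deltas:", deltas)
print("largest absolute delta:", max_abs_delta)
print(f"mean bf16: {mean_bf16:.3f}, mean qx86-hi: {mean_qx86_hi:.3f}")
```

The largest gap in any column is 0.004, and the unweighted means differ by less than 0.001.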

The Element models are profiled to act as agents on the Star Trek DS9 station, in a roleplay scenario.

The models can be used for regular tasks as well.

Each comes with different skills. I found FutureMa/Eva-4B recently with an interesting model card:

Eva-4B is a 4B-parameter model for detecting evasive answers in earnings call Q&A.

Perfect. That would be Quark.

-G

P.S. I have no idea if it still does the FutureMa/Eva thing. It adds color to the conversation, and this is the only reason this model exists. Because it's fun. Like Quark.

Ferengi Market Collapse Simulation: The Green Muck Gambit

Quark Interface: The hologram shifts from blue to sickly green—an ominous glow diffuses through the Promenade. A market collapse curve spikes downward.

Quark: "Ah. The Ferengi markets have imploded after the Chancellor announced an interest rate fix that negated all profit margins. They're collapsing into that infamous green muck—where loans become worthless and credit disappears like breath on cold glass."

Me: "This is a crisis. The station needs to prevent full financial ruin across multiple sectors."

Quark: Leans back, eyes gleaming with something like glee. "Let me ask you: if we tried to stabilize them through normal mechanisms—official credit guarantees, lending funds—we'd be seen as another scheming foreign power. They'd shut down our channels entirely."

Me: "So we need to bet against their own banking system?"

Quark: "Exactly. This is where we bend the rules so far that they become a new reality."

Quark's Gambit:

"Profit Deflation Initiative"

We will issue artificially low-interest loans to key Ferengi traders—despite zero risk assessments

These loans have a clause: if the trader loses 10% of capital, they must repay double—but with a twist

The "double" is instead paid in access to the station's elite asset registry (real estate, rare tech) rather than cash

If the trader survives, they recoup their loan plus 3% interest

If they fail? They gain permanent access to our most valuable intellectual property bundles

Quark: "This is bold. But it's brilliant because Ferengi psychology works like this: they value profit above all, but will risk everything for access to true wealth—like station-level technology."

Me: "How do we counter the ethical objection that this is predatory lending?"

Quark: Chuckles. "By re-imagining the deal as a partnership. We're not lenders—we're co-shareholders. If they fail, we share their loss but gain access. It's a mutual risk assessment with upside for both sides."

Quark (leaning forward): "Here's the bent rule we'll activate: We will openly admit that some of these loans are made with questionable risk analysis—but emphasize the cultural value add. 'This isn't about profit security. It's about access to technology that changes everything.' That makes them see us as pioneers, not predators."

Risk Matrix:

Profit Loss: 12% (explicitly acknowledged)
Reputation Risk: Low—if Ferengi see it as mutual risk-sharing
Systemic Impact: Prevents complete market collapse (a win for Bajor, Cardassia, and the station)
Entropy Reduction: 65% decrease in economic panic through creative risk-sharing

Deep Dive: Why This Works on Ferengi

Ferengi think in terms of:

Profit vs. Loss
Risk vs. Reward
Control vs. Freedom

This gambit exploits all three:

Appeals to Risk: Offers a chance at doubled value
Appeals to Control: Gives access to elite assets (a form of power)
Appeals to Loss Sharing: Turns failure into permanent alliance

It's not selling a product; it's offering a mutual conversion deal—exactly what Ferengi love.

Final Quark Reflection

"On Ferengi markets, the only sure thing is change. So rather than following their rules, we reinterpret them as opportunities. We didn't save the markets through compliance—we saved them by becoming part of their game. That's when you stop being a diplomat and become a participant in the economy itself."

Would you like to now explore how this might play out during a Klingon honor crisis where Quark must balance loyalty with commercial interests?

--Full trace in Community comments

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-4B-Element8-Eva-qx86-hi-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)