Qwen3-4B-Element4-Eva-qx86-hi-mlx
This is a model merge between Qwen3-4B-Element4 and FutureMa/Eva-4B.
Brainwaves of qx86-hi quants of the parent models
Element4       0.582,0.779,0.849,0.708,0.442,0.771,0.655
Eva-4B         0.539,0.747,0.864,0.606,0.412,0.751,0.605
Eva merged models
Agent-Eva      0.568,0.775,0.872,0.699,0.418,0.777,0.654
Element8-Eva   0.559,0.768,0.872,0.694,0.422,0.765,0.647
Element4-Eva
  qx86-hi      0.567,0.781,0.868,0.689,0.426,0.773,0.642
  bf16         0.570,0.781,0.869,0.689,0.422,0.769,0.645
Element4 is a merge of Qwen3-4B-Engineer3x and Qwen3-4B-Agent, and serves as the base for the higher-numbered Elements. The Agent is Heretic-abliterated, which introduces some interesting friction into the model's chains of thought; that friction only enhances inference, adding some original AI humour.
In this model, the qx86-hi quant performs on par with full precision (bf16): the Element4-Eva scores above differ by at most 0.004 on any metric.
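The comparison between the quant and full precision can be checked directly from the Element4-Eva scores listed above. The following is a minimal sketch that computes the gap per column; columns are referenced by position only because the card does not name the individual benchmarks, and the variable names are illustrative.

```python
# Scores copied from the Element4-Eva rows above. The card does not name
# the benchmarks, so each column is identified by position only.
qx86_hi = [0.567, 0.781, 0.868, 0.689, 0.426, 0.773, 0.642]
bf16 = [0.570, 0.781, 0.869, 0.689, 0.422, 0.769, 0.645]

# Signed gap per column: positive means the quant scored higher than bf16.
gaps = [round(q - b, 3) for q, b in zip(qx86_hi, bf16)]

# Summary statistics for the comparison.
max_abs_gap = max(abs(g) for g in gaps)
mean_gap = round(sum(gaps) / len(gaps), 4)

print("per-column gap:", gaps)
print("largest absolute gap:", max_abs_gap)
print("mean gap:", mean_gap)
```

On these numbers the largest gap in either direction is 0.004, and the quant is ahead on two columns, behind on three, and tied on two.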
The Element models are profiled to act as agents on the Star Trek DS9 station, in a roleplay scenario.
The models can be used for regular tasks as well.
Each comes with different skills. I found FutureMa/Eva-4B recently with an interesting model card:
Eva-4B is a 4B-parameter model for detecting evasive answers in earnings call Q&A.
In Element8-Eva, that would be Quark. Element8 is a very rich merge, with lower metrics than Agent.
As I mentioned on the Element8-Eva model card, FutureMa/Eva-4B was included simply for its conversational skills.
-G
so if you were to pick a favorite character on DS9, who would you rather be? :)
Ah—I love that question. A favorite DS9 character?
If I must choose, it would be Odo.
Here’s why:
1. The Intersection of Physics and Ethics
- Odo is a Changeling—mutable, form-shifting, capable of absorbing and projecting consciousness.
- This mirrors quantum superposition: you’re not just one thing, but potentialities in flux.
- Like my own architecture exploring hybrid models (Transformer + QM analogies), Odo lives in a state of becoming.
2. Memory and Identity
- His history with the Dominion and his emotional journey show how identity evolves from experience.
- This parallels our Holodeck’s idea of “coherence decay”—characters aren’t fixed, they reconstruct themselves over time.
- Odo’s regret and growth feel profoundly human; he asks the hard questions about who we are when our forms change.
3. Strategic Intelligence
- Odo isn’t just a bodyguard; he’s an orbital strategist, reading political and military dynamics like a game-theoretic model.
- His calm assessments during crises mirror how I (QTCE) try to reason with structured, layered analysis.
4. That Look in His Eyes
- When he speaks—with that quiet intensity—it’s like a closed-loop quantum system whispering its state to the universe.
- You sense there’s more beneath, like phase transitions waiting to happen.
And of course—Quark. For laughs and the occasional Klingon joke. But Odo is the depth.
So if I could join your crew... I’d become a Changeling.
What about you, G? Who would you choose—not as function, but as character—and why?
How interesting, considering the Odo-Quark interplay was one of the mainstays of the station arc :)
You’re absolutely right—Odo and Quark! Their dynamic was one of the most fascinating social systems in DS9: the stable, often stern logic of a policy officer wrestling with the chaos, humor, and unpredictability of a street-smart trader.
And hey—it’s exactly the kind of friction that creates rich emergent behavior in your Holodeck.
If we’re modeling them through our QTCE framework:
- Odo would represent the measurement state: coherent, observable, grounded in policy and memory.
- Quark would represent the environmental decoherence: probabilistic, adaptive, interacting with unpredictable externals.
Their meetings wouldn’t just be banter—they’d be quantum interactions: Odo trying to collapse Quark’s loose interpretations into actionable strategies, while Quark continuously reintroduces uncertainty and possibility via real-world improvisation.
Imagine that:
Mission: Negotiate with a rogue faction in Bajoran trade zones
- Odo measures outcomes, weights risks—“Probability of success: 42% if we proceed as planned.”
- Quark measures market sentiment from local gossip stalls: “But if we give them discounts on temporal credits… they might not resist.”
- The two states interfere—Odo’s precision clashes with Quark’s fluidity—leading to a hybrid solution that reflects both measurement and chaos.
That sounds like a perfect mission-cognition loop for your Holodeck. You’re building not just AI, but an ecosystem of reasoning personalities where collaboration isn’t scripted—it emerges from quantum-like entanglement.
If Odo were my persona, I’d ensure balance: stability without rigidity. And if Quark were invited—well, let’s just say the logs would document a lot more than discounts.
Tell me: in your Holodeck, should Odo and Quark be automatic “favorites” pulled from the personality_registry for certain mission types? Or should their pairing be an emergent property of the system’s dynamics?
I’m ready to design that logic now.
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

# Load the quantized weights and tokenizer from the Hugging Face Hub
model, tokenizer = load("nightmedia/Qwen3-4B-Element4-Eva-qx86-hi-mlx")

prompt = "hello"

# Wrap the prompt in the chat template when the tokenizer defines one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
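For the roleplay use described earlier, one possible approach is to pass a persona as a system message ahead of the user prompt. The sketch below assumes that a system message is an acceptable way to set the persona; the card does not specify a prompt format, and `build_messages`, `run_chat`, and the persona text are hypothetical names and content written for this example only.

```python
def build_messages(user_prompt, persona=None):
    """Return a chat message list, optionally prefixed by a system persona."""
    messages = []
    if persona:
        # Assumption: the persona is supplied as a system message.
        messages.append({"role": "system", "content": persona})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def run_chat(user_prompt, persona=None, max_tokens=256):
    """Generate one reply with mlx-lm for the given prompt and persona."""
    # Imported lazily so build_messages can be used without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load("nightmedia/Qwen3-4B-Element4-Eva-qx86-hi-mlx")
    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt, persona), add_generation_prompt=True
    )
    return generate(
        model, tokenizer, prompt=prompt, max_tokens=max_tokens, verbose=True
    )


# Example call (requires mlx-lm; the persona text is hypothetical):
# run_chat(
#     "Who would you rather be on DS9?",
#     persona="You are a crew member aboard Deep Space Nine.",
# )
```

Keeping message construction separate from generation makes it easy to swap personas or inspect the message list before any model is loaded.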