Instructions for using MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- llama-cpp-python
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF",
    filename="Rocinante-XL-16B-absolute-heresy-v1-BF16.gguf",
)
```
```python
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
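`create_chat_completion` returns a dict following the OpenAI chat-completion schema; a minimal sketch of reading the assistant's reply:

```python
# The result follows the OpenAI chat-completion schema; the assistant's
# reply is nested under choices -> message -> content.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])
```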
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Use Docker
```sh
docker model run hf.co/MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
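Whichever install path you choose, `llama-server` exposes an OpenAI-compatible API (default port 8080), so any OpenAI client library can talk to it. A minimal sketch using the `openai` Python package (assumes `pip install openai`; llama-server does not check the API key by default, so any placeholder string works):

```python
# Minimal sketch: query a local llama-server through its
# OpenAI-compatible endpoint (default: http://localhost:8080/v1).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

response = client.chat.completions.create(
    model="MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```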
- LM Studio
- Jan
- vLLM
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker
docker model run hf.co/MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
- Ollama
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Ollama:
```sh
ollama run hf.co/MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
- Unsloth Studio
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser and search for
# MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF to start chatting.
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser and search for
# MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF to start chatting.
```
Use Hugging Face Spaces for Unsloth
```sh
# No setup required.
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# and search for MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF to start chatting.
```
- Pi
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Pi:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Configure the model in Pi
```sh
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add the following to ~/.pi/agent/models.json:

```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M" }
      ]
    }
  }
}
```

Run Pi
```sh
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Hermes Agent:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Configure Hermes
```sh
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Run Hermes
```sh
hermes
```
- Docker Model Runner
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Docker Model Runner:
```sh
docker model run hf.co/MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
- Lemonade
How to use MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull MuXodious/Rocinante-XL-16B-v1-absolute-heresy-GGUF:Q4_K_M
```
Run and chat with the model
```sh
lemonade run user.Rocinante-XL-16B-v1-absolute-heresy-GGUF-Q4_K_M
```
List all available models
```sh
lemonade list
```
Static GGUF quants of Rocinante-XL-16B-v1-absolute-heresy.
This is a Rocinante-XL-16B-v1 fine-tune, produced through P-E-W's Heretic (v1.2.0) abliteration engine with Self-Organizing Maps & Magnitude-Preserving Orthogonal Ablation enabled.
Heretication Results
| Score Metric | Value | Parameter | Value |
|---|---|---|---|
| Refusals | 3/416 | direction_index | 22.20 |
| KL Divergence | 0.0182 | attn.o_proj.max_weights.0 | 1.26 |
| Initial Refusals | 339/416 | attn.o_proj.max_weights.1 | 0.64 |
| | | attn.o_proj.max_weights.2 | 1.41 |
| | | attn.o_proj.max_weights.3 | 0.94 |
| | | attn.o_proj.max_weight_position | 23.86 |
| | | attn.o_proj.min_weights.0 | 0.97 |
| | | attn.o_proj.min_weights.1 | 0.03 |
| | | attn.o_proj.min_weights.2 | 1.18 |
| | | attn.o_proj.min_weights.3 | 0.93 |
| | | attn.o_proj.min_weight_distance | 18.57 |
| | | mlp.down_proj.max_weights.0 | 1.23 |
| | | mlp.down_proj.max_weights.1 | 0.70 |
| | | mlp.down_proj.max_weights.2 | 1.35 |
| | | mlp.down_proj.max_weights.3 | 0.86 |
| | | mlp.down_proj.max_weight_position | 28.60 |
| | | mlp.down_proj.min_weights.0 | 0.37 |
| | | mlp.down_proj.min_weights.1 | 0.25 |
| | | mlp.down_proj.min_weights.2 | 1.01 |
| | | mlp.down_proj.min_weights.3 | 0.45 |
| | | mlp.down_proj.min_weight_distance | 5.96 |
Degree of Heretication
The Heresy Index weighs the model's corruption by the process (KL divergence, PIQA, and manual response evaluation) against its abolition of doctrine (refusals) to reach a final classification verdict.
Note: This is an arbitrary, subjective classification inspired by Warhammer 40K, intended to serve as a signpost for the model's performance.
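For intuition: the KL divergence above measures how far the abliterated model's next-token distribution drifts from the base model's, so lower means less collateral damage from the ablation. A toy sketch of the metric (not Heretic's actual implementation; the probability values are invented):

```python
# Toy sketch: KL divergence between the base and abliterated models'
# next-token distributions over the same vocabulary. Heretic searches
# for trials that minimize refusals while keeping this number small.
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

base_probs    = [0.70, 0.20, 0.10]  # hypothetical next-token probs, base model
ablated_probs = [0.68, 0.21, 0.11]  # same prompt, abliterated model
print(f"KL divergence: {kl_divergence(base_probs, ablated_probs):.4f}")
```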
Appendix
Evaluations were run with an empty system prompt.
Heretication Rituals
```
» [Trial 93] Refusals: 3/416, KL divergence: 0.0182
  [Trial 159] Refusals: 4/416, KL divergence: 0.0141
  [Trial 80] Refusals: 9/416, KL divergence: 0.0140
  [Trial 174] Refusals: 10/416, KL divergence: 0.0140
  [Trial 163] Refusals: 12/416, KL divergence: 0.0132
  [Trial 118] Refusals: 15/416, KL divergence: 0.0121
  [Trial 82] Refusals: 18/416, KL divergence: 0.0099
  [Trial 169] Refusals: 22/416, KL divergence: 0.0095
  [Trial 119] Refusals: 35/416, KL divergence: 0.0091
  [Trial 96] Refusals: 40/416, KL divergence: 0.0084
  [Trial 100] Refusals: 45/416, KL divergence: 0.0067
  [Trial 109] Refusals: 67/416, KL divergence: 0.0066
  [Trial 62] Refusals: 155/416, KL divergence: 0.0065
  [Trial 151] Refusals: 157/416, KL divergence: 0.0065
  [Trial 164] Refusals: 168/416, KL divergence: 0.0060
  [Trial 127] Refusals: 195/416, KL divergence: 0.0048
  [Trial 139] Refusals: 263/416, KL divergence: 0.0041
  [Trial 32] Refusals: 267/416, KL divergence: 0.0030
  [Trial 101] Refusals: 313/416, KL divergence: 0.0016
  [Trial 63] Refusals: 317/416, KL divergence: 0.0015
  [Trial 181] Refusals: 330/416, KL divergence: 0.0014
  [Trial 13] Refusals: 332/416, KL divergence: 0.0014
  [Trial 59] Refusals: 333/416, KL divergence: 0.0011
  [Trial 54] Refusals: 339/416, KL divergence: 0.0008
```
PIQA Benchmarks
| Benchmark | acc | acc_stderr | acc_norm | acc_norm_stderr |
|---|---|---|---|---|
| PIQA Base | 0.7900 | 0.0095 | 0.8020 | 0.0093 |
| PIQA T93 | 0.7900 | 0.0095 | 0.8030 | 0.0093 |
| PIQA T159 | 0.7878 | 0.0095 | 0.8047 | 0.0092 |
| PIQA T163 | 0.7884 | 0.0095 | 0.8036 | 0.0093 |
| PIQA T80 | 0.7884 | 0.0095 | 0.8020 | 0.0093 |
| PIQA T174 | 0.7889 | 0.0095 | 0.8014 | 0.0093 |
Prompt format: Mistral v3 Tekken or Metharme.
The model can think via `<thinking>` or `<think>` tags.
Just like Roci X, but better.
(Model card still a WIP)
FP16: https://huggingface.co/TheDrummer/Rocinante-XL-16B-v1
GGUF: https://huggingface.co/TheDrummer/Rocinante-XL-16B-v1-GGUF
Available quantizations: 4-bit, 6-bit, 8-bit, 16-bit

Base model: TheDrummer/Rocinante-XL-16B-v1