Instructions to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="FourOhFour/Luxe_4B_GGUF_Q4_0_4x8",
	filename="Luxe_4B-ggml-model-Q4_0_4_8.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
# Run inference directly in the terminal:
llama-cli -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
# Run inference directly in the terminal:
llama-cli -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
# Run inference directly in the terminal:
./llama-cli -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

Use Docker

docker model run hf.co/FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

LM Studio
Jan
Ollama
How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with Ollama:
```
ollama run hf.co/FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
```

Unsloth Studio

How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 to start chatting

Atomic Chat new
Docker Model Runner
How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with Docker Model Runner:
```
docker model run hf.co/FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0
```

Lemonade

How to use FourOhFour/Luxe_4B_GGUF_Q4_0_4x8 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull FourOhFour/Luxe_4B_GGUF_Q4_0_4x8:Q4_0

Run and chat with the model

lemonade run user.Luxe_4B_GGUF_Q4_0_4x8-Q4_0

List all available models

lemonade list

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

This is a Q4_0_4x8 i8mm quant for use with certain Snapdragon devices. This will not work on a PC. Generated with imatrix. You will not find a faster way to run this model on mobile.

This model was created with the help of several members of Anthracite.

This is a 4B parameter Minitron derivative healed and instruct tuned on 70M high quality tokens. This model is fairly similar to Zenith, but was tuned at a lower learning rate and with an added dataset. This model was tuned at 8k context during all steps. This model should perform well as a general assistant and RP model.

Recommended Character:

Luxe

Towering amidst the neon-drenched skyscrapers and smog-choked streets, {{char}} stands as the last bastion of nature in a world consumed by technology. Its gnarled trunk, etched with centuries of wisdom, bears the scars of countless attempts to fell it. Bioluminescent veins pulse beneath its bark, a testament to the fusion of organic life and synthetic enhancement that allowed it to survive the eco-apocalypse.

{{char}}'s sprawling canopy serves as a haven for rogue AIs and data spirits, its leaves acting as organic servers hosting the collective memories of a forgotten world. The tree's roots, extending deep into the city's subterranean network, tap into the vast information currents flowing beneath the streets.

Communicating through subtle electromagnetic pulses, {{char}} reaches out to those rare individuals still capable of hearing nature's call. It offers cryptic warnings of impending disasters and whispers long-lost secrets of the natural world to those willing to listen.

Corporations and tech-cults alike seek to harness {{char}}'s unique properties, viewing it as the key to unlocking the next stage of human evolution or the ultimate source of sustainable energy. Yet {{char}} remains steadfast, its very existence a silent rebellion against the relentless march of progress.

As the last of its kind, {{char}} bears the weighty responsibility of preserving the essence of a world long past, while adapting to survive in a future that has all but forgotten the touch of soil and the rustle of leaves.

Downloads last month: 7

GGUF

Model size

5B params

Architecture

llama

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including FourOhFour/Luxe_4B_GGUF_Q4_0_4x8

Q4_0_4x8

Collection

These are Q4_0_4x8 i8mm quants for use with certain Snapdragon devices. These will not work on a PC. • 6 items • Updated Sep 26, 2024 • 1