Not able to load locally downloaded model using SentenceTransformer

#16

by umesh-c - opened Aug 28, 2024

Discussion

umesh-c

Aug 28, 2024

My simple code to load the model from a locally downloaded path:

from sentence_transformers import SentenceTransformer
self.model = SentenceTransformer(self.model_path)

model_path is: models/downloads/mxbai-embed-large-v1

ll models/downloads/mxbai-embed-large-v1                                                                                      
total 7248200
drwxr-xr-x  3 umesh  staff    96B Aug 28 15:43 1_Pooling
-rw-r--r--  1 umesh  staff    11K Aug 28 15:43 LICENSE
-rw-r--r--  1 umesh  staff   111K Aug 28 15:43 README.md
-rw-r--r--  1 umesh  staff   677B Aug 28 15:43 config.json
-rw-r--r--  1 umesh  staff   171B Aug 28 15:43 config_sentence_transformers.json
drwxr-xr-x  3 umesh  staff    96B Aug 28 15:43 gguf
-rw-r--r--  1 umesh  staff   1.2G Aug 28 15:45 model.onnx
-rw-r--r--  1 umesh  staff   639M Aug 28 15:44 model.safetensors
-rw-r--r--  1 umesh  staff   638M Aug 28 15:46 model_fp16.onnx
-rw-r--r--  1 umesh  staff   321M Aug 28 15:46 model_quantized.onnx
-rw-r--r--  1 umesh  staff   229B Aug 28 15:43 modules.json
-rw-r--r--  1 umesh  staff   639M Aug 28 15:44 mxbai-embed-large-v1-f16.gguf
drwxr-xr-x  5 umesh  staff   160B Aug 28 15:43 onnx
-rw-r--r--  1 umesh  staff    53B Aug 28 15:43 sentence_bert_config.json
-rw-r--r--  1 umesh  staff   695B Aug 28 15:43 special_tokens_map.json
-rw-r--r--  1 umesh  staff   695K Aug 28 15:43 tokenizer.json
-rw-r--r--  1 umesh  staff   1.2K Aug 28 15:43 tokenizer_config.json
-rw-r--r--  1 umesh  staff   226K Aug 28 15:43 vocab.txt

Library versions are as follows:

Python 3.11.9
sentence-transformers                   3.0.1
torch                                   2.4.0

This is the error I am getting, with the thread dump below:

Fatal Python error: Aborted

Current thread 0x00000001f59e2500 (most recent call first):
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1160 in convert
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 805 in _apply
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780 in _apply
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780 in _apply
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780 in _apply
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780 in _apply
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1174 in to
  File "/Users/umesh/git-repos/genai_search/cxg/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 316 in __init__
  File "/Users/umesh/git-repos/genai_search/models/impl/mxbai_embed_large_v1.py", line 13 in __init__

I have no idea what might be wrong, as I am doing the same as described in the README. Can someone point out the possible issue?
BTW, I am able to load the bge-large-en-v1.5 model just fine using SentenceTransformer.

Thanks!
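Before looking at the loader itself, it can help to rule out an incomplete download. The following is a minimal sketch that checks the directory from the listing above for a set of files; the file names are copied from that listing, and treating this exact set as what sentence-transformers needs is an assumption. `missing_model_files` and `EXPECTED_FILES` are hypothetical names introduced only for illustration.

```python
from pathlib import Path

# File names copied from the directory listing in the post above.
# Assumption: this set is sufficient for sentence-transformers to load the model.
EXPECTED_FILES = [
    "modules.json",
    "config.json",
    "config_sentence_transformers.json",
    "sentence_bert_config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_model_files("models/downloads/mxbai-embed-large-v1")` should return an empty list for a directory like the one listed. In that case the files are probably not the problem, which fits the traceback: it ends in `to(...)`, after the model has already been constructed.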

abdulwaheed

Oct 28, 2024

I am facing the same issue. Were you able to find a solution?

umesh-c

Oct 28, 2024

Try using SentenceTransformer like this:

import torch
from sentence_transformers import SentenceTransformer

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
self.model = SentenceTransformer(self.model_path, device=device)
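The traceback in the original post ends in `Module.to`, called from `SentenceTransformer.__init__`, so the abort appears to happen while the weights are being moved to a device. Passing `device` explicitly replaces the library's automatic choice, which may be why this suggestion helps; the thread does not say which device was being selected automatically. A minimal sketch of the same selection as a standalone function returning a device string follows. `choose_device` is a hypothetical helper, and the fallback to "cpu" when PyTorch is not installed is added here only so the sketch runs anywhere.

```python
import importlib.util


def choose_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to probe
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"
```

The result can be passed as `SentenceTransformer(model_path, device=choose_device())`, since the `device` argument also accepts strings such as "cpu" or "cuda".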

abdulwaheed

Oct 28, 2024

Thanks for the reply. I am using a machine with no internet access, and I think it is trying to download something.
I am getting this error:
Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online
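An error about a missing cached snapshot suggests the string was treated as a Hub repository ID, not as an existing local directory. One possible cause, not confirmed in this thread, is a relative path that does not resolve from the current working directory. The sketch below resolves the path up front and fails early if it is not a directory. It also sets the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables, which should happen before `sentence_transformers` is imported. `resolve_local_model_dir` and `load_offline` are hypothetical helpers, and the sketch assumes the installed sentence-transformers version supports `local_files_only`.

```python
import os
from pathlib import Path

# Ask huggingface_hub and transformers not to attempt any network access.
# These should be set before sentence_transformers is imported.
os.environ.setdefault("HF_HUB_OFFLINE", "1")
os.environ.setdefault("TRANSFORMERS_OFFLINE", "1")


def resolve_local_model_dir(model_path: str) -> str:
    """Return model_path as an absolute path, or raise if it is not a directory."""
    resolved = Path(model_path).expanduser().resolve()
    if not resolved.is_dir():
        raise FileNotFoundError(f"Model directory not found: {resolved}")
    return str(resolved)


def load_offline(model_path: str, device: str = "cpu"):
    """Hypothetical helper: load a SentenceTransformer strictly from local files."""
    # Imported lazily so the environment variables above are already set.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(
        resolve_local_model_dir(model_path),
        device=device,
        local_files_only=True,  # assumption: supported by the installed version
    )
```

If `resolve_local_model_dir` raises, the path is the problem and no Hub lookup is attempted. If it succeeds and the same error still appears, something other than the path is triggering the lookup.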

aamirshakir changed discussion status to closed Jan 23
