text-generation-webui: AttributeError: 'Offload_LlamaModel' object has no attribute 'preload', when trying to generate text

#21

by hpnyaggerman - opened May 1, 2023

Discussion

hpnyaggerman

May 1, 2023

Traceback (most recent call last):
  File "D:\oobabooga\text-generation-webui\modules\callbacks.py", line 66, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "D:\oobabooga\text-generation-webui\modules\text_generation.py", line 290, in generate_with_callback
    shared.model.generate(**kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
    return self.sample(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
    outputs = self(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\models\llama\modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\oobabooga\text-generation-webui\repositories\GPTQ-for-LLaMa\llama_inference_offload.py", line 135, in forward
    if idx <= (self.preload - 1):
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Offload_LlamaModel' object has no attribute 'preload'
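
The last frame of the traceback reads `self.preload` inside `forward`. Here is a minimal sketch of that failure mode. It assumes, and nothing in this thread confirms it, that the loading code is expected to assign `preload` after building the model, and that layers below the threshold stay on the GPU. `OffloadModelSketch` is a hypothetical plain-Python stand-in, not the real `Offload_LlamaModel`.

```python
class OffloadModelSketch:
    """Hypothetical stand-in for an offloading model; not the GPTQ-for-LLaMa class."""

    def __init__(self, num_layers):
        self.num_layers = num_layers
        # Assumption: `preload` is deliberately NOT set here.
        # The code that loads the model is expected to assign it afterwards.

    def forward(self):
        placements = []
        for idx in range(self.num_layers):
            # Same comparison as the failing line in the traceback.
            if idx <= (self.preload - 1):
                placements.append("gpu")  # assumed meaning: layer kept on the GPU
            else:
                placements.append("cpu")  # assumed meaning: layer offloaded to the CPU
        return placements


model = OffloadModelSketch(num_layers=4)

# Calling forward before anyone assigned `preload` fails like the report above.
try:
    model.forward()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Once the loader side sets the attribute, the same call succeeds.
model.preload = 2
placements = model.forward()
print(placements)
```

If that reading is right, the error points to a mismatch between the code that loads the model and the code that runs it, rather than to a problem with the model file itself.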

hpnyaggerman

May 1, 2023

Note: trying to load the GPTQ safetensors model

autobots

May 2, 2023

Try the ooba fork of GPTQ. Do not use the new one.

hpnyaggerman

May 2, 2023

That doesn't seem to change anything, unless I'm using the wrong version. I'm using https://github.com/oobabooga/GPTQ-for-LLaMa/

autobots

May 3, 2023

That is definitely the right one. I only have this set up on Linux, though.

vdruts

May 6, 2023 (edited May 6, 2023)

Worked for me after I downloaded the config.json in the other folder

hpnyaggerman

May 6, 2023

What folder? Where? How?

autobots

May 6, 2023

https://huggingface.co/reeducator/vicuna-13b-free/tree/main/hf-output

That folder also has the FP16 weights, so you can quantize them however you want, for example with act-order and no group size.
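
To fetch the `config.json` from that folder without a browser, something like the sketch below might work. The repository name and the `hf-output` folder come from the link above; the `resolve/main` URL form is the Hub's usual direct-download pattern. The helper names and the local `model_dir` argument are hypothetical and only for illustration.

```python
from pathlib import Path
from urllib.request import urlretrieve

REPO_ID = "reeducator/vicuna-13b-free"  # from the link in this thread
SUBFOLDER = "hf-output"                 # from the link in this thread


def resolve_url(filename, revision="main"):
    """Build the direct-download URL for a file inside the hf-output folder."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{SUBFOLDER}/{filename}"


def download_config(model_dir):
    """Download config.json into model_dir (a hypothetical local model folder)."""
    target = Path(model_dir) / "config.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(resolve_url("config.json"), str(target))
    return target


# Only the URL is built here; nothing is downloaded until download_config() is called.
config_url = resolve_url("config.json")
print(config_url)
```

Calling `download_config(...)` with the folder that holds the GPTQ safetensors file would place `config.json` next to it. Which folder that is depends on the local install.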

GChon

May 6, 2023

The issue for me was an outdated GPTQ-for-LLaMa repo. I checked the readme, and it says to delete that folder before updating.

For anyone who needs instructions:
Windows: Delete the GPTQ-for-LLaMa folder (located at /text-generation-webui/repositories/), then run update_windows.bat if you installed the Windows version.
Linux: Delete the same folder and replace it with the newest version from https://github.com/oobabooga/GPTQ-for-LLaMa.git
Clone the repo into the repositories folder with:

git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda

cd into the newly cloned directory, then run:

python -m pip install -r requirements.txt
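
The same steps can be scripted. This is a rough sketch, not an official updater: the repository URL, the `cuda` branch, and the `repositories/GPTQ-for-LLaMa` location come from the post above, while the function names, the `webui_root` argument, and the dry-run default are made up for illustration. It also assumes the script runs inside the same Python environment the webui uses.

```python
import shutil
import subprocess
import sys
from pathlib import Path

REPO_URL = "https://github.com/oobabooga/GPTQ-for-LLaMa.git"  # from the post above
BRANCH = "cuda"                                               # from the post above


def build_commands(webui_root):
    """Return (argv, working_directory) pairs for the clone and install steps."""
    repositories = Path(webui_root) / "repositories"
    checkout = repositories / "GPTQ-for-LLaMa"
    return [
        (["git", "clone", REPO_URL, "-b", BRANCH], repositories),
        # sys.executable keeps pip tied to the interpreter running this script.
        ([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"], checkout),
    ]


def reinstall_gptq(webui_root, dry_run=True):
    """Delete the old checkout, then clone and install. Does nothing when dry_run is True."""
    checkout = Path(webui_root) / "repositories" / "GPTQ-for-LLaMa"
    commands = build_commands(webui_root)
    if not dry_run:
        if checkout.exists():
            # May raise on Windows if git left read-only files behind.
            shutil.rmtree(checkout)
        for argv, cwd in commands:
            subprocess.run(argv, cwd=cwd, check=True)
    return commands


# Dry run: only print what would be executed.
planned = reinstall_gptq("text-generation-webui")
for argv, cwd in planned:
    print(cwd, ":", " ".join(argv))
```

Pass `dry_run=False` only after checking that the printed paths match the local install, since that mode deletes the existing folder.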
