Model Card for UIGEN-T2-7B-GGUF
Model Overview
We're excited to introduce UIGEN-T2, the next evolution in our UI generation model series. Fine-tuned from the highly capable Qwen2.5-Coder-7B-Instruct base model using PEFT/LoRA, UIGEN-T2 is specifically designed to generate HTML and Tailwind CSS code for web interfaces. What sets UIGEN-T2 apart is its training on a massive 50,000 sample dataset (up from 400) and its unique UI-based reasoning capability, allowing it to generate not just code, but code informed by thoughtful design principles.
Model Highlights
- High-Quality UI Code Generation: Produces functional and semantic HTML combined with utility-first Tailwind CSS.
- Massive Training Dataset: Trained on 50,000 diverse UI examples, enabling broader component understanding and stylistic range.
- Innovative UI-Based Reasoning: Incorporates detailed reasoning traces generated by a specialized "teacher" model, ensuring outputs consider usability, layout, and aesthetics. (See example reasoning in description below)
- PEFT/LoRA Trained (Rank 128): Efficiently fine-tuned for UI generation. We've published LoRA checkpoints at each training step for transparency and community use!
- Improved Chat Interaction: Streamlined prompt flow. The awkward double "think" prompt is no longer needed, so interaction feels more natural.
Example Reasoning (Internal Guide for Generation)
Here's a glimpse into the kind of reasoning that guides UIGEN-T2 internally, generated by our specialized teacher model:
<|begin_of_thought|>
When approaching the challenge of crafting an elegant stopwatch UI, my first instinct is to dissect what truly makes such an interface delightful yet functional; hence, I consider both aesthetic appeal and usability grounded in established heuristics like Nielsen's "aesthetic and minimalist design" alongside Gestalt principles... placing the large digital clock prominently aligns with Fitts' Law... The glassmorphism effect here enhances visual separation... typography choices: the use of a monospace font family ("Fira Code" via Google Fonts) supports readability... iconography paired with labels inside buttons provides dual coding... Tailwind CSS v4 enables utility-driven consistency... critical reflection concerns responsiveness: flexbox layouts combined with relative sizing guarantee graceful adaptation...
<|end_of_thought|>
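If you only want the generated markup, the reasoning trace can be separated from the rest of the output. The following is a minimal sketch, assuming the `<|begin_of_thought|>` and `<|end_of_thought|>` markers appear verbatim in the decoded text; `split_reasoning` is a hypothetical helper, not part of any Tesslate tooling.

```python
import re

# Assumption: the reasoning markers survive decoding as plain text.
THOUGHT_PATTERN = re.compile(
    r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", re.DOTALL
)


def split_reasoning(generated: str) -> tuple[str, str]:
    """Return (reasoning, remainder) for one decoded generation.

    If no reasoning block is found, reasoning is an empty string and the
    whole (stripped) text is returned as the remainder.
    """
    match = THOUGHT_PATTERN.search(generated)
    if match is None:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    # Keep everything outside the reasoning block, e.g. the HTML/Tailwind code.
    remainder = (generated[: match.start()] + generated[match.end():]).strip()
    return reasoning, remainder


sample = (
    "<|begin_of_thought|>\nPlace the clock prominently.\n<|end_of_thought|>\n"
    '<div class="p-4">00:00</div>'
)
reasoning, code = split_reasoning(sample)
print(reasoning)
print(code)
```

If the markers are registered as special tokens in your tokenizer, decoding with `skip_special_tokens=True` may strip them, in which case this split will find nothing and return the full text as the remainder.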
Example Outputs
Use Cases
Recommended Uses
- Rapid UI Prototyping: Quickly generate HTML/Tailwind code snippets from descriptions or wireframes.
- Component Generation: Create standard and custom UI components (buttons, cards, forms, layouts).
- Frontend Development Assistance: Accelerate development by generating baseline component structures.
- Design-to-Code Exploration: Bridge the gap between design concepts and initial code implementation.
Limitations
- Current Framework Focus: Primarily generates HTML and Tailwind CSS. Bootstrap support is planned.
- Complex JavaScript Logic: Focuses on structure and styling; dynamic behavior and complex state management typically require manual implementation.
- Highly Specific Design Systems: May need further fine-tuning for strict adherence to unique, complex corporate design systems.
How to Use
You have to use this system prompt:
You are Tesslate, a helpful assistant specialized in UI generation.
These are the recommended parameters: temperature 0.7, top-p 0.9.
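The system prompt and sampling settings above can be kept in one place so every request uses them. This is a minimal sketch, assuming the ChatML-style turn markers (`<|im_start|>`, `<|im_end|>`) used elsewhere in this card also apply to the system turn; `build_chatml_prompt` and `RECOMMENDED_SAMPLING` are illustrative names, not part of any library.

```python
# System prompt and sampling values are taken from this model card.
SYSTEM_PROMPT = "You are Tesslate, a helpful assistant specialized in UI generation."
RECOMMENDED_SAMPLING = {"temperature": 0.7, "top_p": 0.9}


def build_chatml_prompt(user_request: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Render a single-turn prompt with the required system prompt.

    Assumption: the system turn uses the same <|im_start|>/<|im_end|>
    markers as the user and assistant turns.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_request}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("Create a pricing table with three tiers.")
print(prompt)
print(RECOMMENDED_SAMPLING)
```

The prompt ends with an open assistant turn, so the model continues from there with its reasoning and code.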
Inference Example
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Make sure you have PEFT installed: pip install peft
from peft import PeftModel

# Use your specific model name/path once uploaded
model_name_or_path = "tesslate/UIGEN-T2"  # Placeholder - replace with actual HF repo name
base_model_name = "Qwen/Qwen2.5-Coder-7B-Instruct"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,  # or float16 if bf16 not supported
    device_map="auto",
)

# Load the PEFT model (LoRA weights)
model = PeftModel.from_pretrained(base_model, model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)  # Use base tokenizer

# Note the simplified prompt structure (no double 'think').
# The system turn carries the required system prompt from "How to Use".
prompt = """<|im_start|>system
You are Tesslate, a helpful assistant specialized in UI generation.<|im_end|>
<|im_start|>user
Create a simple card component using Tailwind CSS with an image, title, and description.<|im_end|>
<|im_start|>assistant
"""  # Model will generate reasoning and code following this

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Recommended sampling parameters (temperature 0.7, top_p 0.9); adjust as needed
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
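The example above loads the LoRA adapter on top of the base model. For the quantized GGUF file in this repository, a rough equivalent with llama-cpp-python might look like the sketch below. Treat the specifics as assumptions: the file name `uigen-t2-7b-3600-q8_0.gguf` should be checked against the repository's file list, `n_ctx=4096` is an arbitrary context size chosen to leave room for 1024 generated tokens, and `build_chat_request` and `generate_ui` are hypothetical helper names.

```python
SYSTEM_PROMPT = "You are Tesslate, a helpful assistant specialized in UI generation."


def build_chat_request(user_request: str, max_tokens: int = 1024) -> dict:
    """Build keyword arguments for Llama.create_chat_completion."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_request},
        ],
        # Recommended sampling parameters from this model card.
        "temperature": 0.7,
        "top_p": 0.9,
        "max_tokens": max_tokens,
    }


def generate_ui(user_request: str) -> str:
    """Download the GGUF file if needed, then run one chat completion."""
    # Imported lazily so build_chat_request works without llama-cpp-python.
    from llama_cpp import Llama  # pip install llama-cpp-python huggingface_hub

    llm = Llama.from_pretrained(
        repo_id="Tesslate/UIGEN-T2-7B-Q8_0-GGUF",
        filename="uigen-t2-7b-3600-q8_0.gguf",  # assumption: verify in the repo
        n_ctx=4096,  # assumption: context size, adjust to your memory budget
    )
    response = llm.create_chat_completion(**build_chat_request(user_request))
    return response["choices"][0]["message"]["content"]
```

Calling `generate_ui("Create a simple card component using Tailwind CSS with an image, title, and description.")` downloads the model on first use and returns the assistant message as a string.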
Performance and Evaluation
- Strengths:
- Generates semantically correct and well-structured HTML/Tailwind CSS.
- Leverages a large dataset (50k samples) for improved robustness and diversity.
- Incorporates design reasoning for more thoughtful UI outputs.
- Improved usability via streamlined chat template.
- Openly published LoRA checkpoints for community use.
- Weaknesses:
- Currently limited to HTML/Tailwind CSS (Bootstrap planned).
- Complex JavaScript interactivity requires manual implementation.
- Reinforcement Learning refinement (for stricter adherence to principles/rewards) is a future step.
Technical Specifications
- Architecture: Transformer-based LLM adapted with PEFT/LoRA
- Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
- Adapter Rank (LoRA): 128
- Training Data Size: 50,000 samples
- Precision: Trained using bf16/fp16. Base model requires appropriate precision handling.
- Hardware Requirements: Recommend GPU with >= 16GB VRAM for efficient inference (depends on quantization/precision).
- Software Dependencies:
- Hugging Face Transformers (transformers)
- PyTorch (torch)
- Parameter-Efficient Fine-Tuning (peft)
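To see why the VRAM recommendation depends on precision, here is a back-of-the-envelope estimate of weight memory alone. It is a sketch under stated assumptions: a nominal 7 billion parameters (the exact count differs slightly), 16 bits per weight for bf16/fp16, and 8.5 bits per weight for GGUF Q8_0 (32 int8 values plus one fp16 scale per block). KV cache, activations, and runtime overhead are not included, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


NOMINAL_PARAMS = 7e9  # assumption: nominal size of a "7B" model

# Bits per weight for each storage format.
BITS_PER_WEIGHT = {
    "bf16/fp16": 16.0,
    "Q8_0 (GGUF)": 8.5,  # 34 bytes per block of 32 weights
}

for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_memory_gib(NOMINAL_PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions, unquantized weights take roughly twice the memory of the Q8_0 file, which is why the 16GB recommendation leaves little headroom at bf16/fp16 but is comfortable for the quantized model.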
Citation
If you use UIGEN-T2 or the LoRA checkpoints in your work, please cite us:
@misc{tesslate_UIGEN-T2,
title={UIGEN-T2: Scaling UI Generation with Reasoning on Qwen2.5-Coder-7B},
author={tesslate},
year={2024}, # Adjust year if needed
publisher={Hugging Face},
url={https://huggingface.co/tesslate/UIGEN-T2} # Placeholder URL
}
Contact & Community