xelm-gemma-4b-slavic-freeze
Layer-freezing strategy: the middle transformer layers are kept frozen at the base Gemma-3-4B weights, and only the first and last layers are updated during continued pretraining (CPT). Freezing the middle layers is intended to mitigate catastrophic forgetting of general capabilities.
- Base model: google/gemma-3-4b-pt
- Strategy: freeze
- Language family: Slavic
- Code: https://github.com/sanchit-ahuja/scaling-multilingual-experts
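The freeze strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repo: the helper names `trainable_layer_indices` and `freeze_middle_layers` and the parameters `n_first` and `n_last` are hypothetical, and the card does not state how many first and last layers were left trainable. The helper only assumes that each layer exposes a PyTorch-style `parameters()` method.

```python
from typing import Iterable, List, Set


def trainable_layer_indices(num_layers: int, n_first: int, n_last: int) -> Set[int]:
    """Return the indices of layers left trainable: the first n_first and last n_last.

    n_first and n_last are hypothetical knobs; the model card does not give the values used.
    """
    if n_first < 0 or n_last < 0:
        raise ValueError("n_first and n_last must be non-negative")
    first = range(0, min(n_first, num_layers))
    last = range(max(num_layers - n_last, 0), num_layers)
    return set(first) | set(last)


def freeze_middle_layers(layers: Iterable, n_first: int, n_last: int) -> List[int]:
    """Disable gradients for every layer outside the first/last blocks.

    Each layer only needs a parameters() method (as torch.nn.Module provides).
    Returns the indices of the layers that were frozen.
    """
    layers = list(layers)
    keep = trainable_layer_indices(len(layers), n_first, n_last)
    frozen = []
    for index, layer in enumerate(layers):
        train = index in keep
        for param in layer.parameters():
            # Frozen parameters keep their base-model values during training.
            param.requires_grad = train
        if not train:
            frozen.append(index)
    return frozen
```

With a Transformers model, `layers` would be the list of decoder blocks. The attribute path to that list depends on the model class, so inspect the loaded model rather than assuming a fixed path.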
Loading
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("sanchitahuja205/xelm-gemma-4b-slavic-freeze")
tokenizer = AutoTokenizer.from_pretrained("sanchitahuja205/xelm-gemma-4b-slavic-freeze")
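Once the model and tokenizer are loaded as above, text can be generated with the standard Transformers `generate` API. The helper below is a minimal sketch: the name `generate_continuation` and its defaults are illustrative and do not come from the model card. Because the base model is a pretrained (not instruction-tuned) checkpoint, the sketch treats the prompt as plain text to be continued.

```python
def generate_continuation(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Continue a plain-text prompt with greedy decoding (hypothetical helper)."""
    # Tokenize to PyTorch tensors and move them to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding (do_sample=False) keeps the output deterministic.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # The first returned sequence contains the prompt tokens followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_continuation(model, tokenizer, "Dawno, dawno temu")` would ask the model to continue a Polish story opening.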
Training recipe
The exact training recipe lives in configs/yaml/train_gemma_freeze.yaml in the code repo. The resolved config used for this specific run is also included in this model repo as training_config.yaml; load it with pyrallis to reproduce the run bit-for-bit. The command below launches training from the repo recipe:
python train.py --config_path configs/yaml/train_gemma_freeze.yaml
Citation
@misc{ahuja2026parameteralignmentmitigatescatastrophic,
title={Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models},
author={Sanchit Ahuja and Terra Blevins},
year={2026},
eprint={2606.00284},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2606.00284},
}