gg-hf-tg

community
Activity Feed

AI & ML interests

None defined yet.

danielhanchen 
posted an update 5 days ago
danielhanchen 
posted an update 7 days ago
danielhanchen 
posted an update 12 days ago
alvarobartt 
posted an update 24 days ago
view post
Post
363
Open agents on AWS SageMaker AI with open models from the Hugging Face Hub!

> Deploy an open model from the Hugging Face Hub on SageMaker AI
> Connect the deployed model to Strands Agents
> Add built-in and custom tools for tool calling
> Expose external capabilities through MCP integration
> Bonus: talk to your agent and visualize traces with Gradio

https://alvarobartt.com/agents-on-aws-sagemaker
danielhanchen 
posted an update 25 days ago
alvarobartt 
posted an update 28 days ago
view post
Post
3311
Latest hf-mem release added a breakdown of Mixture-of-Experts (MoE) memory usage!

TL; DR MoEs can be misleading to reason about from active parameters alone, since each token only activates a subset of experts, while the serving setup still needs to account for the full resident memory footprint.

🧠 hf-mem now splits MoE memory into base model weights, routed experts, and KV cache
🏗️ Dense models usually load and use most weights every forward pass, while MoEs load many experts but only route each token to a few of them
⚡ Active params isn't the same as memory footprint, especially for sparse architectures
📦 Runtime memory is about what is used per request/token, while loading memory also includes the expert weights that need to be resident
📚 KV cache can still dominate depending on context length, batch size, and concurrency
🔀 Expert Parallelism (EP) helps shard experts across accelerators when expert weights dominate
🚀 Data Parallelism (DP) + EP is often a good fit for throughput-oriented MoE serving

Check the repository at https://github.com/alvarobartt/hf-mem
danielhanchen 
posted an update about 1 month ago
view post
Post
5853
We’re excited to announce that Unsloth has joined the PyTorch Ecosystem! 🔥🦥

Unsloth is an open-source project that makes training & running models more accurate and faster with less compute. Our mission is to make local AI accessible to everyone. Thanks to all of you for making this possible! 💕

Blog: https://unsloth.ai/blog/pytorch
GitHub: https://github.com/unslothai/unsloth
  • 2 replies
·
danielhanchen 
posted an update about 1 month ago
view post
Post
7733
We collaborated with NVIDIA to teach you how we made LLM training ~25% faster! 🚀

Learn how 3 optimizations help your home GPU train models faster:
1. Packed-sequence metadata caching
2. Double-buffered checkpoint reloads
3. Faster MoE routing

Guide: https://unsloth.ai/blog/nvidia-collab
GitHub: https://github.com/unslothai/unsloth
danielhanchen 
posted an update about 1 month ago
view post
Post
8875
We made a guide on how to run open LLMs in Claude Code, Codex and OpenClaw.

Use Gemma 4 and Qwen3.6 GGUFs for local agentic coding on 24GB RAM

Run with self-healing tool calls, code execution, web search via the Unsloth API endpoint and llama.cpp

Guide: https://unsloth.ai/docs/basics/api
danielhanchen 
posted an update about 2 months ago
view post
Post
10829
Unsloth is now one of the top 10 most followed organizations on Hugging Face. 🤗🦥

Thanks so much for all the support!
Our HF page:
unsloth
  • 5 replies
·
danielhanchen 
posted an update about 2 months ago
danielhanchen 
posted an update about 2 months ago
danielhanchen 
posted an update 2 months ago
danielhanchen 
posted an update 2 months ago
danielhanchen 
posted an update 3 months ago
view post
Post
2803
A new way to use Unsloth.

Coming soon...
danielhanchen 
posted an update 3 months ago
view post
Post
949
You don’t need to set LLM parameters anymore! 🚀

llama.cpp uses only the context length + compute your local setup needs. Unsloth also auto-applies the correct model settings

Try in Unsloth Studio - now with precompiled llama.cpp binaries.

GitHub: https://github.com/unslothai/unsloth
  • 2 replies
·
danielhanchen 
posted an update 3 months ago
view post
Post
3442
Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: https://github.com/unslothai/unsloth
Blog and Guide: https://unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.
danielhanchen 
posted an update 3 months ago
view post
Post
3944
We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚 Learn:

• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work

Blog: https://unsloth.ai/blog/rl-environments
  • 4 replies
·
alvarobartt 
posted an update 3 months ago
view post
Post
3749
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr
danielhanchen 
posted an update 4 months ago
view post
Post
3466
100,000+ models trained with Unsloth have now been open-sourced on 🤗Hugging Face! 🦥

Here are the most popular ones you can run local:
1. TeichAI - GLM-4.7-Flash distilled from Claude 4.5 Opus (high)
2. Zed - Qwen Coder 7B fine-tuned for stronger coding
3. DavidAU - Llama-3.3-8B distilled from Claude 4.5 Opus (high)
4. huihui - gpt-oss made “abliberated”

Links to models:
1. TeichAI: TeichAI/GLM-4.7-Flash-Claude-Opus-4.5-High-Reasoning-Distill-GGUF
2. Zed: zed-industries/zeta
3. DavidAU: DavidAU/Llama3.3-8B-Instruct-Thinking-Claude-4.5-Opus-High-Reasoning
4. huihui: huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated

See all the 100K latest models fine-tuned with Unsloth here: https://huggingface.co/models?other=u
  • 2 replies
·