Article: Train AI models with Unsloth and Hugging Face Jobs for FREE • 12 days ago
Space (running, featured): QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model
Paper: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • 2510.14528 • Published Oct 16, 2025
Article: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023
Article: Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025
Collection: Research & Long-Form Blog Posts. In-depth technical articles and research pieces published by Hugging Face • 11 items • Updated 15 days ago
Paper: Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • 2512.20848 • Published Dec 23, 2025
Post: NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! It has a 1M context window and best-in-class performance on SWE-Bench, reasoning, and chat. Run the MoE model locally with 24 GB RAM. GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF • Step-by-step guide: https://docs.unsloth.ai/models/nemotron-3
Article: Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025
Article: Train 400x faster Static Embedding Models with Sentence Transformers • Jan 15, 2025
Article: Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • Jul 29, 2025
Space (running): The Ultra-Scale Playbook • The ultimate guide to training LLMs on large GPU clusters
Paper: Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models • 2410.10733 • Published Oct 14, 2024