I Let a Lobster Run My Jetson: What OpenClaw Taught Me About the Future of Computing • andito • Feb 19 • 16
Streaming datasets: 100x More Efficient • andito, lhoestq, burtenshaw, pcuenq, merve • Oct 27, 2025 • 86
Supercharge your OCR Pipelines with Open Models • merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 309
TimeScope: How Long Can Your Video Large Multimodal Model Go? • orrzohar, ruili0, andito, nicholswang • Jul 23, 2025 • 48
Efficient MultiModal Data Pipeline • ariG23498, lusxvr, andito, sergiopaniego, pcuenq • Jul 8, 2025 • 70
KV Cache from scratch in nanoVLM • ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 119
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data • danaaubakirova, andito, merve, ariG23498, fracapuano, loubnabnl, pcuenq, mshukor, cadene • Jun 3, 2025 • 346
nanoVLM: The simplest repository to train your VLM in pure PyTorch • ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 257
Vision Language Models (Better, faster, stronger) • merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
SmolVLM2: Bringing Video Understanding to Every Device • orrzohar, mfarre, andito, merve, pcuenq, cyrilzakka, Xenova • Feb 20, 2025 • 337
SmolVLM Grows Smaller – Introducing the 256M & 500M Models! • andito, mfarre, merve • Jan 23, 2025 • 192
SmolVLM - small yet mighty Vision Language Model • andito, merve, mfarre, eliebak, pcuenq • Nov 26, 2024 • 417
Deploying Speech-to-Speech on Hugging Face • andito, derek-thomas, dmaniloff, eustlb • Oct 22, 2024 • 45
FineVideo: behind the scenes • mfarre, andito, lewtun, lvwerra, pcuenq, thomwolf • Sep 23, 2024 • 35
LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning? • danaaubakirova, andito • Jul 25, 2024 • 17
Docmatix - a huge dataset for Document Visual Question Answering • andito, HugoLaurencon • Jul 18, 2024 • 78
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models • andito, merve, SkalskiP • Jun 24, 2024 • 207