Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding. Paper • 2601.10611 • Published Jan 15
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models. Paper • 2601.21639 • Published Jan 29
Qwen3-TTS Demo: Generate custom speech from text, voice descriptions, or samples