deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient in terms of vision tokens/performance ratio
> covers 100 languages
super cool vision language datasets
ServiceNow/ui-vision • Viewer • Updated May 7, 2025 • 1.46k • 3.39k • 20
xxxllz/Chart2Code-160k • Updated Jul 7, 2025 • 84 • 10
ReCAP-Agent/ReCAP-187k-SFT • Viewer • Updated about 20 hours ago • 188k • 25 • 5
allenai/MolmoPoint-GUISyn • Viewer • Updated 9 days ago • 37k • 564 • 10

Multimodal tool calling datasets
AgoraX/OpenImage-FNCall-50k • Viewer • Updated Feb 14, 2024 • 53.3k • 51 • 3
ScaleAI/VisualToolBench • Viewer • Updated Dec 16, 2025 • 1.2k • 4.49k • 4
internlm/ARM-Thinker-Data • Preview • Updated Feb 13 • 46 • 7

Daggr Image To 3d 👀 • Running on CPU Upgrade • 18
Convert images into 3D assets with background removal and enhancement

SAM3 Video Segmentation 🐠 • Running on Zero • Featured • 112
Track and label objects in videos using text prompts or clicks