hustvl/InfiniteVL-LongSFT • Image-Text-to-Text • 4B • Updated • 67 • 2
DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models