VLLMs
LLaVA-OneVision: Easy Visual Task Transfer Paper • 2408.03326 • Published Aug 6, 2024 • 61
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models Paper • 2408.02718 • Published Aug 5, 2024 • 62
NVLM: Open Frontier-Class Multimodal LLMs Paper • 2409.11402 • Published Sep 17, 2024 • 74
rdzotz/w2v-bert-2.0-mongolian-colab-CV16.0 Automatic Speech Recognition • 0.6B • Updated Jan 29, 2024