Haonan Chen
Haon-Chen
AI & ML interests
None yet
Recent Activity
new activity 2 days ago
Haon-Chen/e5-omni-7B: Integrate with Sentence Transformers v5.4
new activity 2 days ago
Haon-Chen/e5-omni-3B: Integrate with Sentence Transformers v5.4
upvoted a paper about 2 months ago
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories
Organizations
Vidore-v2-full
SPEED
Aligned embedding data synthesis models and embedding model. Our paper: https://arxiv.org/pdf/2410.18634
MoCa
HomePage: https://haon-chen.github.io/MoCa/
mmE5
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data
- intfloat/mmE5-mllama-11b-instruct
  Zero-Shot Image Classification • 11B • Updated • 127 • 20
- mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data
  Paper • 2502.08468 • Published • 16
- intfloat/mmE5-synthetic
  Viewer • Updated • 560k • 1.53k • 6
- intfloat/mmE5-MMEB-hardneg
  Viewer • Updated • 1.47M • 703 • 1
e5-omni
A lightweight explicit alignment recipe that adapts off-the-shelf VLMs into robust omni-modal embedding models. https://arxiv.org/abs/2601.03666