
Haiwen Diao

Paranioar

https://Paranioar.github.io/

AI & ML interests

Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model

Recent Activity

authored a paper 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

commented on a paper 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

updated a collection 1 day ago

NEO1_5


Organizations

authored a paper 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 3 days ago • 65

commented on a paper 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 3 days ago • 65

updated a collection 1 day ago

NEO1_5

Collection

From Pixels to Words -- Towards Native One-Vision Models at Scale • 3 items • Updated 1 day ago • 6

upvoted a paper 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 3 days ago • 65

submitted a paper to Daily Papers 1 day ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 3 days ago • 65

liked 2 models 1 day ago

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 1 day ago • 32 • 2

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 1 day ago • 36 • 3

upvoted a collection 1 day ago

NEO1_5

Collection

From Pixels to Words -- Towards Native One-Vision Models at Scale • 3 items • Updated 1 day ago • 6

updated 2 models 1 day ago

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 1 day ago • 36 • 3

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 1 day ago • 32 • 2

published 2 models 2 days ago

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 1 day ago • 32 • 2

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 1 day ago • 36 • 3

upvoted 2 papers 2 days ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published 5 days ago • 24

SpatialBench: Is Your Spatial Foundation Model an All-Round Player?

Paper • 2605.27367 • Published 4 days ago • 64

upvoted a paper 7 days ago

PhysX-Omni: Unified Simulation-Ready Physical 3D Generation for Rigid, Deformable, and Articulated Objects

Paper • 2605.21572 • Published 10 days ago • 51

liked a model 13 days ago

sensenova/SenseNova-U1-8B-MoT-Infographic

Any-to-Any • 18B • Updated 13 days ago • 5.39k • 39

authored 3 papers 15 days ago

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Paper • 2601.22153 • Published Jan 29 • 75

VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?

Paper • 2602.04802 • Published Feb 4 • 2

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 18 days ago • 191

updated a collection 17 days ago

SenseNova-U1

Collection

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 9 items • Updated 1 day ago • 67
