MoChat: Joints-Grouped Spatio-Temporal Grounding LLM for Multi-Turn Motion Comprehension and Description
Paper • 2410.11404 • Published • 1
MoChat is a Multimodal Large Language Model (MLLM) for human motion understanding with precise spatio-temporal grounding. Unlike conventional motion analysis systems, MoChat combines joints-grouped spatial grounding with temporal grounding, supporting multi-turn motion comprehension and description.
We provide the following trained models for download: