Abstract
Neural Object Kinematics learns a data-driven parameterization for 4D dynamic object simulation by combining latent space representation with transformer-based encoding-decoding, enabling realistic temporal deformations across diverse object types.
Data-driven approaches have revolutionized 3D vision, enabling transformers to effectively reconstruct and generate static 3D objects. However, generating simulative 4D dynamics (realistic temporal deformations of static objects under various physical conditions) remains challenging and often ad hoc, despite its importance in building comprehensive 3D world models. Most existing methods assume a predefined physical model and use system identification to estimate its parameters, which restricts them to specific object categories and small-scale datasets. We propose that these restrictions can be overcome by learning a data-driven kinematic state parameterization for object-centric physical systems. Specifically, we learn both a latent space representing all possible states of the object and a decoder that maps any sampled latent to a plausibly deformed shape of the object. We refer to this parameterization as Neural Object Kinematics (NeuROK), and train a transformer-based encoder-decoder model on a curated large-scale 4D dataset. This formulation and the learned model significantly simplify the generation of simulative dynamics: from the perspective of Lagrangian mechanics, we only need to consider the dynamics within a low-dimensional latent space. We demonstrate the effectiveness and generality of this neural simulation framework across diverse dynamic object types, showing clear advantages over prior works. Project page: https://chen-geng.com/neurok
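To make the idea of running dynamics in a low-dimensional latent space concrete, the following is a minimal sketch and not the NeuROK model. It assumes a hypothetical toy system: three unit-mass particles on a line, a hand-written linear decoder (the `BASIS` matrix) standing in for the learned transformer decoder, and a simple quadratic elastic energy. Because this stand-in decoder is linear, its Jacobian is constant and the Euler-Lagrange equations reduce to M_z z'' = -dV/dz, where M_z = J^T M J is the reduced mass matrix. All names and constants below are illustrative assumptions.

```python
# Toy latent-space Lagrangian dynamics (illustrative assumptions throughout).
REST = [0.0, 1.0, 2.0]        # rest positions of 3 particles on a line
MASS = [1.0, 1.0, 1.0]        # particle masses
STIFFNESS = 4.0               # k in V(x) = 0.5 * k * |x - REST|^2
# Linear stand-in "decoder": x = REST + BASIS @ z, with a 2D latent z.
BASIS = [[1.0, 0.0],
         [0.0, 1.0],
         [-1.0, 1.0]]

def decode(z):
    """Map a latent state z to particle positions."""
    return [REST[i] + sum(BASIS[i][k] * z[k] for k in range(len(z)))
            for i in range(len(REST))]

def reduced_mass():
    """M_z = J^T M J, where J = BASIS is the (constant) decoder Jacobian."""
    n = len(BASIS[0])
    return [[sum(MASS[i] * BASIS[i][a] * BASIS[i][b] for i in range(len(REST)))
             for b in range(n)] for a in range(n)]

def generalized_force(z):
    """Q = -J^T dV/dx: the potential gradient pulled back to latent space."""
    x = decode(z)
    grad = [STIFFNESS * (x[i] - REST[i]) for i in range(len(x))]
    return [-sum(BASIS[i][k] * grad[i] for i in range(len(x)))
            for k in range(len(z))]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def energy(z, v):
    """Total energy: latent kinetic term plus elastic potential."""
    M = reduced_mass()
    n = len(z)
    kinetic = 0.5 * sum(v[a] * M[a][b] * v[b] for a in range(n) for b in range(n))
    x = decode(z)
    potential = 0.5 * STIFFNESS * sum((x[i] - REST[i]) ** 2 for i in range(len(x)))
    return kinetic + potential

def step(z, v, dt):
    """One symplectic-Euler step of M_z * z'' = Q(z)."""
    acc = solve(reduced_mass(), generalized_force(z))
    v = [v[k] + dt * acc[k] for k in range(len(z))]
    z = [z[k] + dt * v[k] for k in range(len(z))]
    return z, v

def simulate(z0, v0, steps, dt):
    """Integrate in latent space; decode only when shapes are needed."""
    z, v = list(z0), list(v0)
    for _ in range(steps):
        z, v = step(z, v, dt)
    return z, v

if __name__ == "__main__":
    z, v = simulate([0.1, 0.0], [0.0, 0.0], steps=1000, dt=0.001)
    print("latent state:", z)
    print("decoded shape:", decode(z))
```

The integrator only ever touches the two latent coordinates; particle positions are recovered by calling `decode` when a shape is needed. With a nonlinear learned decoder, the Jacobian would depend on the latent state and additional velocity-dependent terms would appear in the equations of motion, which this sketch deliberately omits.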
Community
NeuROK provides a data-driven kinematic state parameterization for object-centric systems, enabling the efficient generation of simulative 4D object dynamics without explicit physical models or prior inductive bias.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Velox: Learning Representations of 4D Geometry and Appearance (2026)
- ClothTransformer: Unified Latent-Space Transformers for Scalable Cloth Simulation (2026)
- MatPhys: Learning Material-Aware Physics Parameters for Deformable Object Simulation from Videos (2026)
- HO-Flow: Generalizable Hand-Object Interaction Generation with Latent Flow Matching (2026)
- ViPS: Video-informed Pose Spaces for Auto-Rigged Meshes (2026)
- RigidFormer: Learning Rigid Dynamics using Transformers (2026)
- CausalGS: Learning Physical Causality of 3D Dynamic Scenes with Gaussian Representations (2026)
Get this paper in your agent:
hf papers read 2605.30347
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash