Abstract
Implicit Preference Alignment (IPA) is a data-efficient post-training framework for hand motion generation that eliminates the need for paired preference data and uses hand-aware local optimization to improve generation quality.
Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Code is released at https://github.com/mdswyz/IPA.
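The abstract describes two ingredients: an implicit-preference objective (maximize the likelihood of a self-generated high-quality sample while a divergence term penalizes drift from the pretrained prior) and a hand-aware weighting that focuses the loss on hand regions. The toy sketch below illustrates both ideas on a discrete distribution; it is a minimal illustration under assumed names (`ipa_loss`, `hand_weighted_loss`, `beta`, `lam`), not the authors' implementation, which operates on diffusion-based video models.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def ipa_loss(policy_logits, prior_logits, good_idx, beta=0.1):
    """Toy single-step implicit-preference objective:
    negative log-likelihood of the high-quality (self-generated) sample,
    plus a KL penalty that keeps the policy close to the pretrained prior.
    No dispreferred ("bad") sample is needed, unlike standard DPO pairs.
    """
    p = softmax(policy_logits)   # current policy distribution
    q = softmax(prior_logits)    # frozen pretrained prior
    nll = -np.log(p[good_idx])                     # likelihood term
    kl = np.sum(p * (np.log(p) - np.log(q)))       # deviation from prior
    return nll + beta * kl

def hand_weighted_loss(per_pixel_loss, hand_mask, lam=2.0):
    """Toy hand-aware local weighting: up-weight loss inside a binary
    hand-region mask so optimization concentrates on hand pixels."""
    weights = 1.0 + lam * hand_mask
    return float(np.mean(weights * per_pixel_loss))

# A policy that concentrates mass on the good sample incurs a lower
# loss than the untuned (uniform) policy, at a small KL cost.
tuned = ipa_loss(np.array([2.0, 0.0, 0.0]), np.zeros(3), good_idx=0)
untuned = ipa_loss(np.zeros(3), np.zeros(3), good_idx=0)
print(tuned < untuned)  # True
```

The KL coefficient `beta` plays the same regularizing role as the reference-model term in RLHF-style objectives: it prevents the post-trained model from collapsing onto a few easy samples and forgetting the pretrained prior.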
Community
Implicit Preference Alignment removes the need for negative (dispreferred) samples in preference optimization.
The following papers were recommended by the Semantic Scholar API
- HP-Edit: A Human-Preference Post-Training Framework for Image Editing (2026)
- VIGOR: VIdeo Geometry-Oriented Reward for Temporal Generative Alignment (2026)
- A Systematic Post-Train Framework for Video Generation (2026)
- Reward-Aware Trajectory Shaping for Few-step Visual Generation (2026)
- World-R1: Reinforcing 3D Constraints for Text-to-Video Generation (2026)
- ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis (2026)
- GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering (2026)