Generative Prompt Model for Weakly Supervised Object Localization Paper • 2307.09756 • Published Jul 19, 2023
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models Paper • 2303.11681 • Published Mar 21, 2023
DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution Paper • 2405.16071 • Published May 25, 2024 • 3
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Paper • 2411.19108 • Published Nov 28, 2024 • 20
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective Paper • 2407.01094 • Published Jul 1, 2024
From Easy to Hard: Building a Shortcut for Differentially Private Image Synthesis Paper • 2504.01395 • Published Apr 2, 2025
Model as a Game: On Numerical and Spatial Consistency for Generative Games Paper • 2503.21172 • Published Mar 27, 2025
DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published Oct 13, 2025 • 27
Balancing Understanding and Generation in Discrete Diffusion Models Paper • 2602.01362 • Published 4 days ago • 13
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints Paper • 2510.14847 • Published Oct 16, 2025 • 56
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 25