Model Summary

UnifiedReward-Flex-qwen3vl-32b is a unified personalized reward model for vision generation that couples reward modeling with flexible, context-adaptive reasoning.

🚀 The inference code is available on GitHub.

For further details, please refer to the following resources:

Citation

@article{unifiedreward-flex,
  title={Unified Personalized Reward Model for Vision Generation},
  author={Wang, Yibin and Zang, Yuhang and Han, Feng and Zhou, Yujie and Bu, Jiazi and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2602.02380},
  year={2026}
}