Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models
wangchenglong
wangclnlp
AI & ML interests
None yet
Recent Activity
upvoted a paper about 24 hours ago
MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning
upvoted a paper 8 days ago
AI Can Learn Scientific Taste
upvoted a paper about 2 months ago
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing