ZhangXiaoyun

DadaCloud01

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

authored a paper 2 months ago

Rediscovering Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

authored a paper 2 months ago

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Organizations

authored 4 papers 2 months ago

Rediscovering Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

Paper • 2510.10959 • Published Oct 13, 2025 • 2

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Paper • 2602.10604 • Published Feb 11, 2026 • 199

Reasoner for Real-World Event Detection: Scaling Reinforcement Learning via Adaptive Perplexity-Aware Sampling Strategy

Paper • 2507.01327 • Published Jul 2, 2025 • 1

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published Mar 17, 2026 • 58

submitted a paper to Daily Papers 2 months ago

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published Mar 17, 2026 • 58

authored 2 papers 5 months ago

CodeV-R1: Reasoning-Enhanced Verilog Generation

Paper • 2505.24183 • Published May 30, 2025 • 9

Step-DeepResearch Technical Report

Paper • 2512.20491 • Published Dec 23, 2025 • 88

authored 2 papers about 1 year ago

Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization

Paper • 2406.16743 • Published Jun 24, 2024 • 1

When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning

Paper • 2505.15400 • Published May 21, 2025 • 23