Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models Paper • 2605.17672 • Published 25 days ago • 22
EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents Paper • 2605.13941 • Published 29 days ago • 24
Covering Human Action Space for Computer Use: Data Synthesis and Benchmark Paper • 2605.12501 • Published about 1 month ago • 16
Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria Paper • 2605.08354 • Published May 8 • 23
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published Mar 5 • 56
GEditBench v2: A Human-Aligned Benchmark for General Image Editing Paper • 2603.28547 • Published Mar 30 • 32
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 95