Keyu Duan

vermouthdky

https://kduan.live

AI & ML interests

LLM Reasoning and Safety

Recent Activity

upvoted 3 papers about 1 month ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5 • 124

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30 • 29

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30 • 81

updated a dataset about 1 month ago

axon-rl/webshop_instructions

Viewer • Updated Oct 27 • 6.91k • 50

published a dataset about 1 month ago

axon-rl/webshop_instructions

Viewer • Updated Oct 27 • 6.91k • 50

updated a dataset about 1 month ago

axon-rl/webshop

Viewer • Updated Oct 27 • 1k • 42

published a dataset about 1 month ago

axon-rl/webshop

Viewer • Updated Oct 27 • 1k • 42

authored 2 papers 2 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14 • 13

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1 • 89

upvoted 3 papers 2 months ago

upvoted 3 papers 7 months ago

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28 • 29

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27 • 26

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26 • 23

upvoted a paper 8 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14 • 13

updated 2 models 8 months ago

sail/ActPRM-X

7B • Updated Apr 15 • 1.28k

sail/ActPRM

7B • Updated Apr 15 • 24

updated a collection 8 months ago

🚀 Active PRM

Collection

Efficient Process Reward Model Training via Active Learning. • 4 items • Updated Apr 16 • 3
