junmingyang

jmyang

https://junming-yang.github.io/

junming-yang

AI & ML interests

LLM Alignment, VLM

Recent Activity

authored a paper 2 months ago

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

authored a paper 2 months ago

Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization

Organizations

None yet

upvoted a paper 14 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published 14 days ago • 54

upvoted 3 papers 2 months ago

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Paper • 2407.11691 • Published Jul 16, 2024 • 15

Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization

Paper • 2509.23371 • Published Sep 27 • 5

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26 • 70

upvoted a paper 3 months ago

UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Paper • 2509.11543 • Published Sep 15 • 47

upvoted a paper 5 months ago

Pixels, Patterns, but No Poetry: To See The World like Humans

Paper • 2507.16863 • Published Jul 21 • 68

upvoted a paper 8 months ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 303

upvoted an article 8 months ago

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11

upvoted a paper 9 months ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16 • 35

upvoted a paper 10 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103

upvoted a paper about 1 year ago

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published Oct 21, 2024 • 60

upvoted an article about 1 year ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Dec 9, 2022 • 376

upvoted 5 papers over 1 year ago

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

Paper • 2408.03361 • Published Aug 6, 2024 • 85

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6, 2024 • 61

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5, 2024 • 35

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 41

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

Paper • 2407.11963 • Published Jul 16, 2024 • 44

upvoted a collection over 1 year ago

InternVL2.0

Collection

Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated Sep 28 • 89

upvoted 2 papers over 1 year ago

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

Paper • 2406.14515 • Published Jun 20, 2024 • 33

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

Paper • 2406.14544 • Published Jun 20, 2024 • 35