WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22 • 2
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping Paper • 2510.03230 • Published Oct 3 • 3
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning Paper • 2508.09804 • Published Aug 13
DRBench: A Realistic Benchmark for Enterprise Deep Research Paper • 2510.00172 • Published Sep 30 • 1
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published 27 days ago • 104
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval Paper • 2511.00903 • Published Nov 2
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30 • 12
WebMMU Collection WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation • 2 items • Updated Sep 16 • 2
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5 • 50
Using In-Context Learning to Improve Dialogue Safety Paper • 2302.00871 • Published Feb 2, 2023 • 1
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines Paper • 2212.10557 • Published Dec 20, 2022