ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning Paper • 2506.10521 • Published Jun 12
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science Paper • 2506.10974 • Published Jun 12