論文 Hugging Face 発表: 2026-05-04 HF ↑2

Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

著者: Zirui Tang, Xuanhe Zhou, Yumou Liu, Linchun Li, Weizheng Wang ほか15名

要約

Workspace learning requires AI agents to identify, reason over, exploit, and update explicit and implicit dependencies among heterogeneous files in a worker’s workspace, enabling them to complete both routine and advanced tasks effectively. Despite its importance, existing relevant benchmarks largel…

#agent#benchmark

Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

要約

同じカテゴリの記事

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

World-R1: テキストから動画生成における3D制約の強化学習による整合

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents