Paper Deep Dive | Hugging Face | Published: 2026-05-31 | HF ↑12

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

Authors: Wenhao Wang, Peizhi Niu, Gongyi Zou, Xiyuan Yang, Jingxing Wang, and 7 others

Abstract

The Model Context Protocol (MCP) has emerged as a transformative standard for connecting large language models (LLMs) with external data sources and tools, and has been rapidly adopted across personal applications and development platforms. However, existing benchmarks predominantly focus on generic…

#benchmark #agent #llm
