Paper | Hugging Face | Published: 2026-05-17 | HF ↑1

MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents

Authors: Ziyun Zeng, Hang Hua, Bocheng Zou, Mu Cai, Rogerio Feris, and 1 other

Abstract

Recent GUI agents have made substantial progress in visual grounding and action prediction, yet they remain brittle in long-horizon tasks that require maintaining task state across many interface transitions. Existing agents typically rely on raw history replay or text-only memory, which either over…

#agent #llm #benchmark #multimodal #fine-tuning
