Paper · Deep Dive · Hugging Face · Published: 2026-05-27 · HF ↑45

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Authors: Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye, Sicheng Xie, and 35 others

Abstract

Embodied intelligence is often studied through specialized models for individual tasks such as manipulation or navigation, resulting in fragmented capabilities and limited generalization across tasks, environments, and robot embodiments. In this work, we study whether heterogeneous embodied decision…

#robotics #benchmark
