論文 arXiv 発表: 2026-05-20

WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata

著者: Basel Shbita, Pengyuan Li, Anna Lisa Gentile

要約

Visual Question Answering (VQA) benchmarks have largely emphasized perception-based tasks that can be solved from visual content alone. In contrast, many real-world scenarios require external knowledge that is not directly observable in the image to answer correctly. We introduce WikiVQABench, a hum…

#benchmark#multimodal#llm

WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata

要約

同じカテゴリの記事

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

World-R1: テキストから動画生成における3D制約の強化学習による整合

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents