arXiv paper, published: 2026-05-05

Safety and accuracy follow different scaling laws in clinical large language models

Authors: Sebastian Wind, Tri-Thien Nguyen, Jeta Sopa, Mahshad Lotfinia, Sebastian Bickelhaup, and 7 others

Abstract

Clinical LLMs are often scaled by increasing model size, context length, retrieval complexity, or inference-time compute, with the implicit expectation that higher accuracy implies safer behavior. This assumption is incomplete in medicine, where a few confident, high-risk, or evidence-contradicting …

#alignment #llm #rag #agent #benchmark
