Safety and accuracy follow different scaling laws in clinical large language models
Abstract
Clinical LLMs are often scaled by increasing model size, context length, retrieval complexity, or inference-time compute, with the implicit expectation that higher accuracy implies safer behavior. This assumption is incomplete in medicine, where a few confident, high-risk, or evidence-contradicting …