Company News Microsoft Research Published: 2026-05-11

SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

Summary

Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute competently but fail to consistently improve the user’s position, even with explicit instructions to optimize for user interest.

#agent

Articles in the same category

Parloa builds service agents customers want to talk to

OpenAI models, Codex, and managed agents arrive on AWS