論文 arXiv 発表: 2026-05-12

Learning, Fast and Slow: Towards LLMs That Adapt Continually

著者: Rishabh Tiwari, Kusha Sareen, Lakshya A Agrawal, Joseph E. Gonzalez, Matei Zaharia ほか4名

要約

Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of plasticity. In contrast, in-context learning with fixed LLM…

#llm#rl

Learning, Fast and Slow: Towards LLMs That Adapt Continually

要約

同じカテゴリの記事

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

World-R1: テキストから動画生成における3D制約の強化学習による整合

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents