Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces
Abstract
As large language model (LLM) agents evolve from isolated tool users into coordinated teams, reinforcement learning (RL) must optimize not only individual actions but also how work is spawned, delegated, communicated, aggregated, and stopped. This paper studies RL for LLM-based multi-agent systems t…