Paper Deep Dive, arXiv. Published: 2026-05-12

Model-based Bootstrap of Controlled Markov Chains

Authors: Ziwei Su, Imon Banerjee, Diego Klabjan

Abstract

We propose and analyze a model-based bootstrap for transition kernels in finite controlled Markov chains (CMCs) with possibly nonstationary or history-dependent control policies, a setting that arises naturally in offline reinforcement learning (RL) when the behavior policy generating the data is un…
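To make the idea of bootstrapping a transition kernel concrete, here is a minimal Python sketch of one simple variant. It is not the authors' algorithm: the abstract above is truncated, and the function names `estimate_kernel` and `bootstrap_kernels`, the toy data, and the uniform fallback for unvisited pairs are all assumptions made for illustration. The sketch estimates the kernel from transition counts, then redraws next-state counts from the fitted kernel while holding the observed number of visits to each (state, action) pair fixed, so it never has to model the behavior policy.

```python
import numpy as np

def estimate_kernel(transitions, n_states, n_actions):
    """Return counts N[s, a, s'] and the empirical kernel P_hat[s, a, s']."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1
    visits = counts.sum(axis=2, keepdims=True)
    # Unvisited (s, a) pairs get a uniform row (an illustrative assumption).
    p_hat = np.where(visits > 0, counts / np.maximum(visits, 1), 1.0 / n_states)
    return counts, p_hat

def bootstrap_kernels(counts, p_hat, n_boot, rng):
    """Resample next-state counts from p_hat, keeping visit counts fixed."""
    n_states, n_actions, _ = counts.shape
    visits = counts.sum(axis=2).astype(int)
    out = np.empty((n_boot, n_states, n_actions, n_states))
    for b in range(n_boot):
        for s in range(n_states):
            for a in range(n_actions):
                n = visits[s, a]
                if n == 0:
                    # Nothing to resample; reuse the fallback row.
                    out[b, s, a] = p_hat[s, a]
                else:
                    out[b, s, a] = rng.multinomial(n, p_hat[s, a]) / n
    return out

# Toy data: (state, action, next_state) triples, made up for this example.
rng = np.random.default_rng(0)
transitions = [(0, 0, 1), (0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 1)]
counts, p_hat = estimate_kernel(transitions, n_states=2, n_actions=2)
boot = bootstrap_kernels(counts, p_hat, n_boot=200, rng=rng)

# Entrywise 95% percentile intervals for the kernel.
lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
```

Conditioning on the visit counts is a design choice of this sketch only. A bootstrap that regenerates whole trajectories would also need a way to redraw actions, which this example deliberately avoids.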

#llm #rl #benchmark
