Paper | Hugging Face | Published: 2026-05-27 | HF ↑14

LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training

Authors: Minju Gwak, Minseo Kwak, Dongseok Lee, Guijin Son, Alan Ritter, and 1 other

Abstract

Reinforcement learning (RL) post-training has been shown to improve reasoning in large language models (LLMs). However, there has been little exploration of the problem of data contamination in RL post-training, potentially undermining generalization and evaluation reliability of the training process its…

#llm #rl #benchmark
