Anti Mode-Collapse in Mean-Field Transformer via Auxiliary Variables
Abstract
We use a mean-field-based transformer model to theoretically investigate how auxiliary variables, such as positional encoding, prevent mode collapse of self-attention mechanisms. The use of mean-field transformers to analyze the properties of self-attention mechanisms has garnered significant attent…