Standard Transformers Achieve the Minimax Rate in Nonparametric Regression with $C^{s,\lambda}$ Targets

arXiv:2602.20555v1 Announce Type: new
Abstract: The tremendous success of Transformer models in fields such as large language models and computer vision necessitates a rigorous theoretical investigation. To the best of our knowledge, this paper is the first work proving that standard Transformers can approximate Hölder functions $C^{s,\lambda}\left([0,1]^{d\times n}\right)$ $(s\in\mathbb{N}_{\geq 0}, 0






Yanming Lai, Defeng Sun




