Tag: td

TD(0) Learning converges for Polynomial mixing and non-linear functions

arXiv:2502.05706v1. Abstract: Theoretical work on Temporal Difference (TD) learning has provided finite-sample and high-probability guarantees for data generated from Markov chains. However, these bounds typically require linear function approximation, instance-dependent step sizes, algorithmic modifications, and restrictive mixing rates. We present theoretical findings for…

February 11, 2025
