Tag: metagradient

Optimizing ML Training with Metagradient Descent

arXiv:2503.13751v1. Abstract: A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an…

March 19, 2025
