Tag: scaling

Scaling ML Inference on Databricks: Liquid or Partitioned? Salted or Not?

A case study on techniques to maximize your clusters. By Hector Mejia; first published on Towards Data Science.

March 1, 2026
Scaling Recommender Transformers to a Billion Parameters

How to implement a new generation of transformer recommenders. By Kirill Khrylchenko; first published on Towards Data Science.

October 22, 2025
Conditional Multidimensional Scaling with Incomplete Conditioning Data

arXiv:2509.16627v1 (announce type: new). Abstract: Conditional multidimensional scaling seeks a low-dimensional configuration from pairwise dissimilarities in the presence of other known features. By taking advantage of available data on the known features, conditional multidimensional scaling improves the estimation quality of the low-dimensional configuration and simplifies knowledge discovery…

September 23, 2025
Scaling Laws for Uncertainty in Deep Learning

arXiv:2506.09648v1 (announce type: new). Abstract: Deep learning has recently revealed the existence of scaling laws, demonstrating that model performance follows predictable trends based on dataset and model sizes. Inspired by these findings and fascinating phenomena emerging in the over-parameterized regime, we examine a parallel direction: do similar scaling…

June 12, 2025
Dimension-adapted Momentum Outscales SGD

arXiv:2505.16098v1 (announce type: new). Abstract: We investigate scaling laws for stochastic momentum algorithms with small batch on the power law random features model, parameterized by data complexity, target complexity, and model size. When trained with a stochastic momentum algorithm, our analysis reveals four distinct loss curve shapes determined by varying data-target…

May 23, 2025
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

arXiv:2503.21878v1 (announce type: cross). Abstract: Inference-time computation provides an important axis for scaling language model performance, but naively scaling compute through techniques like Best-of-$N$ sampling can cause performance to degrade due to reward hacking. Toward a theoretical understanding of how to best…

March 31, 2025