Tag: learning

  • The Volterra signature

    arXiv:2603.04525v1. Announce Type: new. Abstract: Modern approaches for learning from non-Markovian time series, such as recurrent neural networks, neural controlled differential equations or transformers, typically rely on implicit memory mechanisms that can be difficult to interpret or to train over long horizons. We propose the Volterra signature $\mathrm{VSig}(x;K)$ as a principled, explicit…

  • Learning Order Forest for Qualitative-Attribute Data Clustering

    arXiv:2603.03387v1. Announce Type: new. Abstract: Clustering is a fundamental approach to understanding data patterns, wherein the intuitive Euclidean distance space is commonly adopted. However, this is not the case for implicit cluster distributions reflected by qualitative attribute values, e.g., the nominal values of attributes like symptoms, marital status,…

  • The Machine Learning Lessons I’ve Learned This Month

    February 2026: exchange with others, documentation, and MLOps. First appeared on Towards Data Science. By Pascal Janetzky.

  • Efficient Uncoupled Learning Dynamics with $\tilde{O}\!\left(T^{-1/4}\right)$ Last-Iterate Convergence in Bilinear Saddle-Point Problems over Convex Sets under Bandit Feedback

    arXiv:2602.21436v1. Announce Type: new. Abstract: In this paper, we study last-iterate convergence of learning algorithms in bilinear saddle-point problems, a preferable notion of convergence that captures the day-to-day behavior of learning dynamics. We focus on the challenging…

  • Optimizing Deep Learning Models with SAM

    A deep dive into the Sharpness-Aware Minimization (SAM) algorithm and how it improves the generalizability of modern deep learning models. First appeared on Towards Data Science. By Anindya Dey.

  • Interactive Learning of Single-Index Models via Stochastic Gradient Descent

    arXiv:2602.17876v1. Announce Type: new. Abstract: Stochastic gradient descent (SGD) is a cornerstone algorithm for high-dimensional optimization, renowned for its empirical successes. Recent theoretical advances have provided a deep understanding of how SGD enables feature learning in high-dimensional nonlinear models, most notably the \textit{single-index model} with i.i.d.…

  • Agentic AI for Modern Deep Learning Experimentation

    Stop babysitting training runs. Start shipping research. Autonomous experiment management built for/by deep learning engineers. First appeared on Towards Data Science. By Sam Black.

  • Near-Optimal Sample Complexity for Online Constrained MDPs

    arXiv:2602.15076v1. Announce Type: cross. Abstract: Safety is a fundamental challenge in reinforcement learning (RL), particularly in real-world applications such as autonomous driving, robotics, and healthcare. To address this, Constrained Markov Decision Processes (CMDPs) are commonly used to enforce safety constraints while optimizing performance. However, existing methods often suffer…

  • The Cost of Learning under Multiple Change Points

    arXiv:2602.11406v1. Announce Type: new. Abstract: We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical “high confidence” detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically,…

  • The Machine Learning Lessons I’ve Learned Last Month

    Delayed January: deadlines, downtimes, and flow times. First appeared on Towards Data Science. By Pascal Janetzky.

  • Efficient Causal Structure Learning via Modular Subgraph Integration

    arXiv:2601.21014v1. Announce Type: new. Abstract: Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings where existing methods face challenges such as the super-exponential growth of the search space and increasing computational demands. To address this, we introduce VISTA (Voting-based…

  • Minimax Rates for Hyperbolic Hierarchical Learning

    arXiv:2601.20047v1. Announce Type: new. Abstract: We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we first establish a geometric obstruction for Euclidean space: any bounded-radius embedding forces volumetric collapse,…

  • Federated Learning, Part 2: Implementation with the Flower Framework 🌼

    Implementing cross-silo federated learning step by step. First appeared on Towards Data Science. By Parul Pandey.

  • Machine Learning in Production? What This Really Means

    From notebooks to real-world systems. First appeared on Towards Data Science. By Sabrine Bendimerad.

  • Implicit Q-Learning and SARSA: Liberating Policy Control from Step-Size Calibration

    arXiv:2601.18907v1. Announce Type: new. Abstract: Q-learning and SARSA are foundational reinforcement learning algorithms whose practical success depends critically on step-size calibration. Step-sizes that are too large can cause numerical instability, while step-sizes that are too small can lead to slow progress. We propose implicit variants…

  • Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making

    arXiv:2601.19186v1. Announce Type: new. Abstract: Fairness is a central pillar of trustworthy machine learning, especially in domains where accuracy- or profit-driven optimization is insufficient. While most fairness research focuses on supervised learning, fairness in policy learning remains less explored. Because policy learning is…

  • Efficient Learning of Stationary Diffusions with Stein-type Discrepancies

    arXiv:2601.16597v1. Announce Type: new. Abstract: Learning a stationary diffusion amounts to estimating the parameters of a stochastic differential equation whose stationary distribution matches a target distribution. We build on the recently introduced kernel deviation from stationarity (KDS), which enforces stationarity by evaluating expectations of the diffusion’s generator…

  • A Theory of Diversity for Random Matrices with Applications to In-Context Learning of Schrödinger Equations

    arXiv:2601.12587v1. Announce Type: new. Abstract: We address the following question: given a collection $\{\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\}$ of independent $d \times d$ random matrices drawn from a common distribution $\mathbb{P}$, what is the probability that the centralizer of $\{\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\}$…

  • A brief note on learning problem with global perspectives

    arXiv:2601.05441v1. Announce Type: new. Abstract: This brief note considers the problem of learning with dynamic-optimizing principal-agent setting, in which the agents are allowed to have global perspectives about the learning process, i.e., the ability to view things according to their relative importances or in their true…

  • Federated Learning, Part 1: The Basics of Training Models Where the Data Lives

    Understanding the foundations of federated learning. First appeared on Towards Data Science. By Parul Pandey.

  • Stochastic Deep Learning: A Probabilistic Framework for Modeling Uncertainty in Structured Temporal Data

    arXiv:2601.05227v1. Announce Type: new. Abstract: I propose a novel framework that integrates stochastic differential equations (SDEs) with deep generative models to improve uncertainty quantification in machine learning applications involving structured and temporal data. This approach, termed Stochastic Latent Differential Inference (SLDI), embeds…

  • Microeconomic Foundations of Multi-Agent Learning

    arXiv:2601.03451v1. Announce Type: new. Abstract: Modern AI systems increasingly operate inside markets and institutions where data, behavior, and incentives are endogenous. This paper develops an economic foundation for multi-agent learning by studying a principal-agent interaction in a Markov decision process with strategic externalities, where both the principal and the agent…

  • Self-Supervised Learning from Noisy and Incomplete Data

    arXiv:2601.03244v1. Announce Type: new. Abstract: Many important problems in science and engineering involve inferring a signal from noisy and/or incomplete observations, where the observation process is known. Historically, this problem has been tackled using hand-crafted regularization (e.g., sparsity, total-variation) to obtain meaningful estimates. Recent data-driven methods often offer…

  • The Best Data Scientists Are Always Learning

    Part 2: Avoiding burnout, learning strategies and the superpower of solitude. First appeared on Towards Data Science. By Jarom Hulet.

  • Fibonacci-Driven Recursive Ensembles: Algorithms, Convergence, and Learning Dynamics

    arXiv:2601.01055v1. Announce Type: new. Abstract: This paper develops the algorithmic and dynamical foundations of recursive ensemble learning driven by Fibonacci-type update flows. In contrast with classical boosting (Freund and Schapire, 1997; Friedman, 2001), where the ensemble evolves through first-order additive updates, we study second-order recursive architectures in…

  • Active learning for data-driven reduced models of parametric differential systems with Bayesian operator inference

    arXiv:2601.00038v1. Announce Type: new. Abstract: This work develops an active learning framework to intelligently enrich data-driven reduced-order models (ROMs) of parametric dynamical systems, which can serve as the foundation of virtual assets in a digital twin. Data-driven ROMs are explainable, computationally…

  • Learning Python by doing projects: What does that even mean?

    I’m learning Python and considering this approach: choose a real dataset, frame a question I want to answer, then work toward it step by step by breaking it into small tasks and researching each step as needed. For those of you who are already comfortable…

  • Drift Detection in Robust Machine Learning Systems

    A prerequisite for long-term success of machine learning systems. First appeared on Towards Data Science. By Morris Stallmann.

  • Deep Reinforcement Learning: The Actor-Critic Method

    Robot friends collaborate to learn to fly a drone. First appeared on Towards Data Science. By Vedant Jumle.

  • The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel

    AUC measures how well a model ranks positives above negatives, independent of any chosen threshold. First appeared on Towards Data Science. By angela shi.

  • Machine Learning vs AI Engineer: What Are the Differences?

    One of the most confusing questions in tech right now is: What is the difference between an AI engineer and a machine learning engineer? Both are six-figure jobs, but if you choose the wrong one, you could waste months of your career learning the wrong skills…

  • Gaussian Process Assisted Meta-learning for Image Classification and Object Detection Models

    arXiv:2512.20021v1. Announce Type: new. Abstract: Collecting operationally realistic data to inform machine learning models can be costly. Before collecting new data, it is helpful to understand where a model is deficient. For example, object detectors trained on images of rare objects may not be…

  • The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel

    From Random Ensembles to Optimization: Gradient Boosting Explained. First appeared on Towards Data Science. By angela shi.

  • The Machine Learning “Advent Calendar” Day 19: Bagging in Excel

    Understanding ensemble learning from first principles in Excel. First appeared on Towards Data Science. By angela shi.

  • DAG Learning from Zero-Inflated Count Data Using Continuous Optimization

    arXiv:2512.16233v1. Announce Type: new. Abstract: We address network structure learning from zero-inflated count data by casting each node as a zero-inflated generalized linear model and optimizing a smooth, score-based objective under a directed acyclic graph constraint. Our Zero-Inflated Continuous Optimization (ZICO) approach uses node-wise likelihoods with…

  • Advantages and limitations in the use of transfer learning for individual treatment effects in causal machine learning

    arXiv:2512.16489v1. Announce Type: new. Abstract: Generalizing causal knowledge across diverse environments is challenging, especially when estimates from large-scale datasets must be applied to smaller or systematically different contexts, where external validity is critical. Model-based estimators of individual treatment…

  • The Machine Learning “Advent Calendar” Day 18: Neural Network Classifier in Excel

    Understanding forward propagation and backpropagation through explicit formulas. First appeared on Towards Data Science. By angela shi.

  • A Teacher-Student Perspective on the Dynamics of Learning Near the Optimal Point

    arXiv:2512.15606v1. Announce Type: new. Abstract: Near an optimal learning point of a neural network, the learning performance of gradient descent dynamics is dictated by the Hessian matrix of the loss function with respect to the network parameters. We characterize the Hessian eigenspectrum for…

  • The Machine Learning “Advent Calendar” Day 16: Kernel Trick in Excel

    Kernel SVM often feels abstract, with kernels, dual formulations, and support vectors. In this article, we take a different path. Starting from Kernel Density Estimation, we build Kernel SVM step by step as a sum of local bells, weighted and selected by hinge loss,…

  • Towards a pretrained deep learning estimator of the Linfoot informational correlation

    arXiv:2512.12358v1. Announce Type: new. Abstract: We develop a supervised deep-learning approach to estimate mutual information between two continuous random variables. As labels, we use the Linfoot informational correlation, a transformation of mutual information that has many important properties. Our method is based on ground…

  • The Machine Learning “Advent Calendar” Day 15: SVM in Excel

    Instead of starting with margins and geometry, this article builds the Support Vector Machine step by step from familiar models. By changing the loss function and reusing regularization, SVM appears naturally as a linear classifier trained by optimization. This perspective unifies logistic regression, SVM, and…

  • Decentralized Computation: The Hidden Principle Behind Deep Learning

    Most breakthroughs in deep learning, from simple neural networks to large language models, are built upon a principle that is much older than AI itself: decentralization. Instead of relying on a powerful “central planner” coordinating and commanding the behaviors of other components, modern deep-learning-based AI…

  • How to Maximize Agentic Memory for Continual Learning

    Learn how to become an effective engineer with continual learning LLMs. First appeared on Towards Data Science. By Eivind Kjosbakken.

  • The Machine Learning “Advent Calendar” Day 9: LOF in Excel

    In this article, we explore LOF through three simple steps: distances and neighbors, reachability distances, and the final LOF score. Using tiny datasets, we see how two anomalies can look obvious to us but completely different to different algorithms. This reveals the key idea of…

  • Artificial Intelligence, Machine Learning, Deep Learning, and Generative AI — Clearly Explained

    Understanding AI in 2026, from machine learning to generative models. First appeared on Towards Data Science. By Sabrine Bendimerad.

  • The Machine Learning “Advent Calendar” Day 6: Decision Tree Regressor

    During the first days of this Machine Learning Advent Calendar, we explored models based on distances. Today, we switch to a completely different way of learning: Decision Trees. With a simple one-feature dataset, we can see how a tree chooses its first split. The idea…

  • Learning Causality for Longitudinal Data

    arXiv:2512.04980v1. Announce Type: new. Abstract: This thesis develops methods for causal inference and causal representation learning (CRL) in high-dimensional, time-varying data. The first contribution introduces the Causal Dynamic Variational Autoencoder (CDVAE), a model for estimating Individual Treatment Effects (ITEs) by capturing unobserved heterogeneity in treatment response driven by latent risk…

  • The Machine Learning “Advent Calendar” Day 4: k-Means in Excel

    How to implement a training algorithm that finally looks like “real” machine learning. First appeared on Towards Data Science. By angela shi.

  • The Best Data Scientists are Always Learning

    Why continuous learning matters & how to come up with topics to study. First appeared on Towards Data Science. By Jarom Hulet.

  • The Machine Learning “Advent Calendar” Day 3: GNB, LDA and QDA in Excel

    From local distance to global probability. First appeared on Towards Data Science. By angela shi.

  • Revisiting Theory of Contrastive Learning for Domain Generalization

    arXiv:2512.02831v1. Announce Type: new. Abstract: Contrastive learning is among the most popular and powerful approaches for self-supervised representation learning, where the goal is to map semantically similar samples close together while separating dissimilar ones in the latent space. Existing theoretical methods assume that downstream task classes are…

  • The Machine Learning Lessons I’ve Learned This Month

    Christmas connections, Copilot’s costs, careful (no-)choices. First appeared on Towards Data Science. By Pascal Janetzky.

  • Learning, Hacking, and Shipping ML

    Vyacheslav Efimov on AI hackathons, data science roadmaps, and how AI meaningfully changed day-to-day ML Engineer work. First appeared on Towards Data Science. By TDS Editors.

  • The Machine Learning and Deep Learning “Advent Calendar” Series: The Blueprint

    Opening the black box of ML models, step by step, directly in Excel. First appeared on Towards Data Science. By angela shi.

  • Learning Triton One Kernel at a Time: Softmax

    All you need to know about a fast, readable and PyTorch-ready softmax kernel. First appeared on Towards Data Science. By Ryan Pégoud.

  • Operator Models for Continuous-Time Offline Reinforcement Learning

    arXiv:2511.10383v1. Announce Type: new. Abstract: Continuous-time stochastic processes underlie many natural and engineered systems. In healthcare, autonomous driving, and industrial control, direct interaction with the environment is often unsafe or impractical, motivating offline reinforcement learning from historical data. However, there is limited statistical understanding of the approximation errors…

  • Optimal Control of the Future via Prospective Foraging

    arXiv:2511.08717v1. Announce Type: new. Abstract: Optimal control of the future is the next frontier for AI. Current approaches to this problem are typically rooted in either reinforcement learning or online learning. While powerful, these frameworks for learning are mathematically distinct from Probably Approximately Correct (PAC) learning, which…

  • The Probably Approximately Correct Learning Model in Computational Learning Theory

    arXiv:2511.08791v1. Announce Type: new. Abstract: This survey paper gives an overview of various known results on learning classes of Boolean functions in Valiant’s Probably Approximately Correct (PAC) learning model and its commonly studied variants. By Rocco A. Servedio.

  • Distributionally Robust Online Markov Game with Linear Function Approximation

    arXiv:2511.07831v1. Announce Type: new. Abstract: The sim-to-real gap, where agents trained in a simulator face significant performance degradation during testing, is a fundamental challenge in reinforcement learning. Extensive works adopt the framework of distributionally robust RL to learn a policy that acts robustly under worst case…

  • The Three Ages of Data Science: When to Use Traditional Machine Learning, Deep Learning, or an LLM (Explained with One Example)

    A practical use case to describe how the data scientist job changed across three generations of machine learning. The post The Three Ages of Data Science: When to Use Traditional Machine Learning, Deep Learning,…

  • Free Learning Paths for Data Analysts, Data Scientists, and Data Engineers – Using 100% Open Resources

    Hey, I’m Ryan, and I’ve created https://www.datasciencehive.com/learning-paths, a platform offering free, structured learning paths for data enthusiasts and professionals alike. The current paths cover: • Data Analyst: Learn essential skills like SQL, data visualization, and predictive modeling. • Data…

  • Learning Paths for Dynamic Measure Transport: A Control Perspective

    arXiv:2511.03797v1. Announce Type: new. Abstract: We bring a control perspective to the problem of identifying paths of measures for sampling via dynamic measure transport (DMT). We highlight the fact that commonly used paths may be poor choices for DMT and connect existing methods for learning alternate…

  • The Reinforcement Learning Handbook: A Guide to Foundational Questions

    Simplifying all the concepts required to master reinforcement learning. First appeared on Towards Data Science. By Avishek Biswas.

  • The Machine Learning Projects Employers Want to See

    What machine learning projects will actually get you interviews and jobs. First appeared on Towards Data Science. By Egor Howell.

  • Deep Reinforcement Learning: 0 to 100

    Using RL to teach robots to fly a drone. First appeared on Towards Data Science. By Vedant Jumle.

  • The Machine Learning Lessons I’ve Learned This Month

    October 2025: READMEs, MIGs, and movements. First appeared on Towards Data Science. By Pascal Janetzky.

  • Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

    arXiv:2510.20883v1. Announce Type: new. Abstract: Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial…

  • Learning Decentralized Routing Policies via Graph Attention-based Multi-Agent Reinforcement Learning in Lunar Delay-Tolerant Networks

    arXiv:2510.20436v1. Announce Type: new. Abstract: We present a fully decentralized routing framework for multi-robot exploration missions operating under the constraints of a Lunar Delay-Tolerant Network (LDTN). In this setting, autonomous rovers must relay collected data to a lander under intermittent connectivity…

  • Federated Learning and Custom Aggregation Schemes

    A practical guide to designing and analyzing robust aggregation strategies. First appeared on Towards Data Science. By Salman Toor.

  • Personalized Collaborative Learning with Affinity-Based Variance Reduction

    arXiv:2510.16232v1. Announce Type: new. Abstract: Multi-agent learning faces a fundamental tension: leveraging distributed collaboration without sacrificing the personalization needed for diverse agents. This tension intensifies when aiming for full personalization while adapting to unknown heterogeneity levels, gaining collaborative speedup when agents are similar, without performance degradation when…

  • Machine Learning Meets Panel Data: What Practitioners Need to Know

    How to avoid overestimating machine learning models’ performance, usefulness, and real-world applicability due to hidden data leakage. First appeared on Towards Data Science. By Marco Letta.

  • Learning Triton One Kernel at a Time: Matrix Multiplication

    Tiled GEMM, GPU memory, coalescing, and much more! First appeared on Towards Data Science. By Ryan Pégoud.

  • Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation

    arXiv:2510.09908v1. Announce Type: new. Abstract: The rise of large-scale pretrained models has made it feasible to generate predictive or synthetic features at low cost, raising the question of how to incorporate such surrogate predictions into downstream decision-making. We study this problem in the setting of…

  • Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy

    Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy arXiv:2510.06515v1 Announce Type: new Abstract: Online matching problems arise in many complex systems, from cloud services and online marketplaces to organ exchange networks, where timely, principled decisions are critical for maintaining high system performance. Traditional heuristics in these settings are simple and interpretable but typically…

  • Refereed Learning

    Refereed Learning arXiv:2510.05440v1 Announce Type: new Abstract: We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power…

  • Learning Multi-Index Models with Hyper-Kernel Ridge Regression

    Learning Multi-Index Models with Hyper-Kernel Ridge Regression arXiv:2510.02532v1 Announce Type: new Abstract: Deep neural networks excel in high-dimensional problems, outperforming models such as kernel methods, which suffer from the curse of dimensionality. However, the theoretical foundations of this success remain poorly understood. We follow the idea that the compositional structure of the learning task is…

  • Temporal-Difference Learning and the Importance of Exploration: An Illustrated Guide

    Temporal-Difference Learning and the Importance of Exploration: An Illustrated Guide Comparing model-free and model-based RL methods on a dynamic grid world The post Temporal-Difference Learning and the Importance of Exploration: An Illustrated Guide appeared first on Towards Data Science. Ryan Pégoud

  • The Machine Learning Lessons I’ve Learned This Month

    The Machine Learning Lessons I’ve Learned This Month September 2025: library or self-made, Ditto and Launchbar, reading widely and deeply The post The Machine Learning Lessons I’ve Learned This Month appeared first on Towards Data Science. Pascal Janetzky

  • Preparing Video Data for Deep Learning: Introducing Vid Prepper

    Preparing Video Data for Deep Learning: Introducing Vid Prepper A guide to fast video data preprocessing for machine learning The post Preparing Video Data for Deep Learning: Introducing Vid Prepper appeared first on Towards Data Science. Jamie Petherbridge-Conroy

  • Learning Triton One Kernel At a Time: Vector Addition

    Learning Triton One Kernel At a Time: Vector Addition The basics of GPU programming, optimisation, and your first Triton kernel The post Learning Triton One Kernel At a Time: Vector Addition appeared first on Towards Data Science. Ryan Pégoud

  • SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant

    SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant arXiv:2509.15593v1 Announce Type: new Abstract: In transfer learning, a source domain often carries diverse knowledge, and different domains usually emphasize different types of knowledge. Different from handling only a single type of knowledge from all domains in traditional transfer learning methods, we introduce an ensemble learning…

  • Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis

    Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis arXiv:2509.15127v1 Announce Type: new Abstract: We investigate the impact of high-order moments on the learning dynamics of an online Independent Component Analysis (ICA) algorithm under a high-dimensional data model composed of a weighted sum of two non-Gaussian random variables. This…

  • Causal-Symbolic Meta-Learning (CSML): Inducing Causal World Models for Few-Shot Generalization

    Causal-Symbolic Meta-Learning (CSML): Inducing Causal World Models for Few-Shot Generalization arXiv:2509.12387v1 Announce Type: cross Abstract: Modern deep learning models excel at pattern recognition but remain fundamentally limited by their reliance on spurious correlations, leading to poor generalization and a demand for massive datasets. We argue that a key ingredient for human-like intelligence – robust, sample-efficient learning – stems from…

  • Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning

    Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning arXiv:2509.11070v1 Announce Type: new Abstract: We develop a stochastic approximation framework for learning nonlinear operators between infinite-dimensional spaces utilizing general Mercer operator-valued kernels. Our framework encompasses two key classes: (i) compact kernels, which admit discrete spectral decompositions, and (ii) diagonal kernels of the form $K(x,x')=k(x,x')T$, where $k$…

  • Contrastive Network Representation Learning

    Contrastive Network Representation Learning arXiv:2509.11316v1 Announce Type: new Abstract: Network representation learning seeks to embed networks into a low-dimensional space while preserving the structural and semantic properties, thereby facilitating downstream tasks such as classification, trait prediction, edge identification, and community detection. Motivated by challenges in brain connectivity data analysis that is characterized by subject-specific, high-dimensional,…

  • How to Become a Machine Learning Engineer (Step-by-Step)

    How to Become a Machine Learning Engineer (Step-by-Step) Your one-stop guide to becoming a machine learning engineer The post How to Become a Machine Learning Engineer (Step-by-Step) appeared first on Towards Data Science. Egor Howell

  • Kernel VICReg for Self-Supervised Learning in Reproducing Kernel Hilbert Space

    Kernel VICReg for Self-Supervised Learning in Reproducing Kernel Hilbert Space arXiv:2509.07289v1 Announce Type: new Abstract: Self-supervised learning (SSL) has emerged as a powerful paradigm for representation learning by optimizing geometric objectives–such as invariance to augmentations, variance preservation, and feature decorrelation–without requiring labels. However, most existing methods operate in Euclidean space, limiting their ability to capture…

  • The Machine Learning Lessons I’ve Learned This Month

    The Machine Learning Lessons I’ve Learned This Month August 2025: logging, lab notebooks, overnight runs The post The Machine Learning Lessons I’ve Learned This Month appeared first on Towards Data Science. Pascal Janetzky

  • Stochastic Gradients under Nuisances

    Stochastic Gradients under Nuisances arXiv:2508.20326v1 Announce Type: new Abstract: Stochastic gradient optimization is the dominant learning paradigm for a variety of scenarios, from classical supervised learning to modern self-supervised learning. We consider stochastic gradient algorithms for learning problems whose objectives rely on unknown nuisance parameters, and establish non-asymptotic convergence guarantees. Our results show that, while…

  • Polynomial Chaos Expansion for Operator Learning

    Polynomial Chaos Expansion for Operator Learning arXiv:2508.20886v1 Announce Type: new Abstract: Operator learning (OL) has emerged as a powerful tool in scientific machine learning (SciML) for approximating mappings between infinite-dimensional functional spaces. One of its main applications is learning the solution operator of partial differential equations (PDEs). While much of the progress in this area…

  • How to Benchmark Classical Machine Learning Workloads on Google Cloud

    How to Benchmark Classical Machine Learning Workloads on Google Cloud Harnessing CPUs for Practical, Cost-Effective Machine Learning The post How to Benchmark Classical Machine Learning Workloads on Google Cloud appeared first on Towards Data Science. Ehssan Khan

  • Bayesian Inference and Learning in Nonlinear Dynamical Systems: A Framework for Incorporating Explicit and Implicit Prior Knowledge

    Bayesian Inference and Learning in Nonlinear Dynamical Systems: A Framework for Incorporating Explicit and Implicit Prior Knowledge arXiv:2508.15345v1 Announce Type: new Abstract: Accuracy and generalization capabilities are key objectives when learning dynamical system models. To obtain such models from limited data, current works exploit prior knowledge and assumptions about the system. However, the fusion of…

  • Robust Data Fusion via Subsampling

    Robust Data Fusion via Subsampling arXiv:2508.12048v1 Announce Type: new Abstract: Data fusion and transfer learning are rapidly growing fields that enhance model performance for a target population by leveraging other related data sources or tasks. The challenges lie in the various potential heterogeneities between the target and external data, as well as various practical concerns…

  • Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting

    Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting arXiv:2508.11060v1 Announce Type: new Abstract: We propose a Buckley James (BJ) Boost Q learning framework for estimating optimal dynamic treatment regimes under right censored survival data, tailored for longitudinal randomized clinical trial settings. The method integrates accelerated failure time models with iterative boosting…

  • Federated Online Learning for Heterogeneous Multisource Streaming Data

    Federated Online Learning for Heterogeneous Multisource Streaming Data arXiv:2508.06652v1 Announce Type: new Abstract: Federated learning has emerged as an essential paradigm for distributed multi-source data analysis under privacy concerns. Most existing federated learning methods focus on “static” datasets. However, in many real-world applications, data arrive continuously over time, forming streaming datasets. This introduces additional…

  • Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation

    Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation arXiv:2508.07049v1 Announce Type: new Abstract: Anomaly detection (AD) plays a vital role across a wide range of domains, but its performance might deteriorate when applied to target domains with limited data. Domain Adaptation (DA) offers a solution by transferring knowledge from a related source…

  • Hedging with memory: shallow and deep learning with signatures

    Hedging with memory: shallow and deep learning with signatures arXiv:2508.02759v1 Announce Type: new Abstract: We investigate the use of path signatures in a machine learning context for hedging exotic derivatives under non-Markovian stochastic volatility models. In a deep learning setting, we use signatures as features in feedforward neural networks and show that they outperform LSTMs…

  • Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws

    Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws arXiv:2508.03688v1 Announce Type: new Abstract: We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime, where the data is generated as $y \propto \sum_{j=1}^{r}\lambda_j \sigma\left(\langle \boldsymbol{\theta_j}, \boldsymbol{x}\rangle\right), \boldsymbol{x} \sim N(0,\boldsymbol{I}_d)$,…

  • Stellar Flare Detection and Prediction Using Clustering and Machine Learning

    Stellar Flare Detection and Prediction Using Clustering and Machine Learning Combining unsupervised clustering with supervised learning to detect and predict stellar flares The post Stellar Flare Detection and Prediction Using Clustering and Machine Learning appeared first on Towards Data Science. Diksha Sen Chaudhury

  • Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains

    Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains arXiv:2507.15990v1 Announce Type: new Abstract: Simulating stochastic differential equations (SDEs) in bounded domains presents significant computational challenges due to particle exit phenomena, which require accurate modeling of interior stochastic dynamics and boundary interactions. Despite the success of machine learning-based methods in…