Tag: linear

  • Linear Regression with Unknown Truncation Beyond Gaussian Features

    arXiv:2602.12534v1. In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history of study in Statistics and…

  • Physics-informed Gaussian Process Regression in Solving Eigenvalue Problem of Linear Operators

    arXiv:2601.06462v1. Applying Physics-Informed Gaussian Process Regression to the eigenvalue problem $(\mathcal{L}-\lambda)u = 0$ poses a fundamental challenge, where the null source term results in a trivial predictive mean and a degenerate marginal likelihood. Drawing inspiration from system identification, we construct…

  • Mastering Non-Linear Data: A Guide to Scikit-Learn’s SplineTransformer

    Forget stiff lines and wild polynomials. Discover why splines are the “Goldilocks” of feature engineering, offering the perfect balance of flexibility and discipline for non-linear data using Scikit-Learn’s SplineTransformer. (Towards Data Science, Gustavo Santos…)

  • The Interplay of Statistics and Noisy Optimization: Learning Linear Predictors with Random Data Weights

    arXiv:2512.10188v1. We analyze gradient descent with randomly weighted data points in a linear regression model, under a generic weighting distribution. This includes various forms of stochastic gradient descent, importance sampling, but also extends to weighting distributions with…

  • The Machine Learning “Advent Calendar” Day 11: Linear Regression in Excel

    Linear Regression looks simple, but it introduces the core ideas of modern machine learning: loss functions, optimization, gradients, scaling, and interpretation. In this article, we rebuild Linear Regression in Excel, compare the closed-form solution with Gradient Descent, and see how the coefficients evolve step…
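
    As a rough illustration of the comparison this excerpt describes (not the article's Excel workbook), the sketch below fits a one-feature linear regression twice in plain Python: once with the closed-form least-squares formulas and once with batch gradient descent on the mean squared error. The data points, learning rate, step count, and function names are all made up for illustration.

```python
def fit_closed_form(xs, ys):
    """Least-squares slope and intercept from the textbook formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Batch gradient descent on the mean squared error.

    Returns the final coefficients and the per-step history, so the
    evolution of the coefficients can be inspected or plotted.
    """
    n = len(xs)
    slope, intercept = 0.0, 0.0
    history = []
    for _ in range(steps):
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = 2.0 / n * sum(residuals)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
        history.append((slope, intercept))
    return slope, intercept, history


# Hypothetical toy data, roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

cf_slope, cf_intercept = fit_closed_form(xs, ys)
gd_slope, gd_intercept, history = fit_gradient_descent(xs, ys)

print("closed form     :", cf_slope, cf_intercept)
print("gradient descent:", gd_slope, gd_intercept)
print("first GD steps  :", history[:3])
```

    With a small enough learning rate the gradient-descent coefficients approach the closed-form ones. Too large a rate makes the updates diverge, which is one reason feature scaling matters in practice.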

  • Symmetric Linear Dynamical Systems are Learnable from Few Observations

    arXiv:2512.05337v1. We consider the problem of learning the parameters of an $N$-dimensional stochastic linear dynamics under both full and partial observations from a single trajectory of time $T$. We introduce and analyze a new estimator that achieves a small maximum element-wise error…

  • Recurrent Neural Networks with Linear Structures for Electricity Price Forecasting

    arXiv:2512.04690v1. We present a novel recurrent neural network architecture designed explicitly for day-ahead electricity price forecasting, aimed at improving short-term decision-making and operational management in energy systems. Our combined forecasting model embeds linear structures, such as expert models and Kalman filters,…

  • Decomposing Direct and Indirect Biases in Linear Models under Demographic Parity Constraint

    arXiv:2511.11294v1. Linear models are widely used in high-stakes decision-making due to their simplicity and interpretability. Yet when fairness constraints such as demographic parity are introduced, their effects on model coefficients, and thus on how predictive bias is distributed across…

  • Multiple Linear Regression Explained Simply (Part 1)

    The math behind fitting a plane instead of a line. (Towards Data Science, Nikhil Dasari)

  • What Optimization Terminologies for Linear Programming Really Mean

    Understanding duality in optimization problems, primal-to-dual conversion, and the optimality conditions for linear problems. (Towards Data Science, Himalaya Bir Shrestha)

  • Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes

    arXiv:2507.12878v1. The identification of Linear Time-Variant (LTV) systems from input-output data is a fundamental yet challenging ill-posed inverse problem. This work introduces a unified Bayesian framework that models the system’s impulse response, $h(t, \tau)$, as a stochastic…

  • Optimal and Practical Batched Linear Bandit Algorithm

    arXiv:2507.08438v1. We study the linear bandit problem under limited adaptivity, known as the batched linear bandit. While existing approaches can achieve near-optimal regret in theory, they are often computationally prohibitive or underperform in practice. We propose \texttt{BLAE}, a novel batched algorithm that integrates arm…

  • Animating Linear Transformations with Quiver

    A useful tool in your quiver. (Towards Data Science, Artemij Lehmann)

  • A Framework for Non-Linear Attention via Modern Hopfield Networks

    arXiv:2506.11043v1. In this work we propose an energy functional along the lines of Modern Hopfield Networks (MHN), the stationary points of which correspond to the attention due to Vaswani et al. [12], thus unifying both frameworks. The minima of this landscape form…

  • Multiple Linear Regression Analysis

    Implementation of multiple linear regression on real data: assumption checks, model evaluation, and interpretation of results using Python. (Towards Data Science, JUNIOR JUMBONG)

  • A Linear Approach to Data Poisoning

    arXiv:2505.15175v1. We investigate the theoretical foundations of data poisoning attacks in machine learning models. Our analysis reveals that the Hessian with respect to the input serves as a diagnostic tool for detecting poisoning, exhibiting spectral signatures that characterize compromised datasets. We use random matrix theory…

  • Inference for max-linear Bayesian networks with noise

    arXiv:2505.00229v1. Max-Linear Bayesian Networks (MLBNs) provide a powerful framework for causal inference in extreme-value settings; we consider MLBNs with noise parameters with a given topology in terms of the max-plus algebra by taking its logarithm. Then, we show that an estimator of a parameter…

  • Foundations of Safe Online Reinforcement Learning in the Linear Quadratic Regulator: $\sqrt{T}$-Regret

    arXiv:2504.18657v1. Understanding how to efficiently learn while adhering to safety constraints is essential for using online reinforcement learning in practical applications. However, proving rigorous regret bounds for safety-constrained reinforcement learning is difficult due to the complex interaction between safety,…

  • Differentially Private Geodesic and Linear Regression

    arXiv:2504.11304v1. In statistical applications it has become increasingly common to encounter data structures that live on non-linear spaces such as manifolds. Classical linear regression, one of the most fundamental methodologies of statistical learning, captures the relationship between an independent variable and a response variable which…

  • An Incremental Non-Linear Manifold Approximation Method

    arXiv:2504.09068v1. Analyzing high-dimensional data presents challenges due to the “curse of dimensionality”, making computations intensive. Dimension reduction techniques, categorized as linear or non-linear, simplify such data. Non-linear methods are particularly essential for efficiently visualizing and processing complex data structures in interactive and graphical applications. This…

  • Linear Programming: Managing Multiple Targets with Goal Programming

    This is the sixth (and likely last) part of a Linear Programming series I’ve been writing. With the core concepts covered by the prior articles, this article focuses on goal programming, which is a less frequent linear programming (LP) use case. Goal programming is a specific linear…

  • Analyzing the Role of Permutation Invariance in Linear Mode Connectivity

    arXiv:2503.06001v1. It was empirically observed in Entezari et al. (2021) that when accounting for the permutation invariance of neural networks, there is likely no loss barrier along the linear interpolation between two SGD solutions, a phenomenon known as linear mode…

  • Exact Recovery of Sparse Binary Vectors from Generalized Linear Measurements

    arXiv:2502.16008v1. We consider the problem of exact recovery of a $k$-sparse binary vector from generalized linear measurements (such as logistic regression). We analyze the linear estimation algorithm (Plan, Vershynin, Yudovina, 2017), and also show information theoretic lower bounds on the number…

  • Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition

    arXiv:2502.09047v1. A common pursuit in modern statistical learning is to attain satisfactory generalization out of the source data distribution (OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model.…

  • TD(0) Learning converges for Polynomial mixing and non-linear functions

    arXiv:2502.05706v1. Theoretical work on Temporal Difference (TD) learning has provided finite-sample and high-probability guarantees for data generated from Markov chains. However, these bounds typically require linear function approximation, instance-dependent step sizes, algorithmic modifications, and restrictive mixing rates. We present theoretical findings for…

  • Statistical Verification of Linear Classifiers

    arXiv:2501.14430v1. We propose a homogeneity test closely related to the concept of linear separability between two samples. Using the test one can answer the question whether a linear classifier is merely “random” or effectively captures differences between two classes. We focus on establishing upper bounds for…

  • Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks

    arXiv:2501.09137v1. We study the gradient descent (GD) dynamics of a depth-2 linear neural network with a single input and output. We show that GD converges at an explicit linear rate to a global minimum of the training…

  • Statistical Learnability of Strategic Linear Classifiers: A Proof Walkthrough

    With the help of an intricate geometric construction, we can prove that instance-wise cost functions quickly drive SVC to infinity. In the previous article in this series, we examined the concept of strategic VC dimension (SVC) and its connection to the Fundamental Theorem of Strategic Learning.…

  • Mastering the Basics: How Linear Regression Unlocks the Secrets of Complex Models

    A full explanation of linear regression and how it learns. Just like Mr. Miyagi taught young Daniel LaRusso karate through repetitive simple chores, which ultimately transformed him into the Karate Kid, mastering foundational algorithms like linear regression…

  • A Bird’s-Eye View of Linear Algebra: Orthonormal Matrices

    Orthonormal matrices: the most elegant matrices in all of linear algebra. (Towards Data Science, Rohit Pandey)

  • Asymptotics of Linear Regression with Linearly Dependent Data

    arXiv:2412.03702v1. In this paper we study the asymptotics of linear regression in settings where the covariates exhibit a linear dependency structure, departing from the standard assumption of independence. We model the covariates using stochastic processes with spatio-temporal covariance and analyze the performance of…