Category: aimlds
-
Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters
Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters arXiv:2503.12266v1 Announce Type: new Abstract: We analyze the prior that a Deep Gaussian Process with polynomial kernels induces. We observe that, even for relatively small depths, averaging effects occur within such a Deep Gaussian Process and that the prior can…
-
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement arXiv:2503.12760v1 Announce Type: new Abstract: To design effective digital interventions, experimenters face the challenge of learning decision policies that balance multiple objectives using offline data. Often, they aim to develop policies that maximize goal outcomes, while ensuring there are no undesirable changes in guardrail…
-
Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring
Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring arXiv:2503.12456v1 Announce Type: new Abstract: The process generates substantial amounts of data with highly complex structures, leading to the development of numerous nonlinear statistical methods. However, most of these methods rely on computations involving large-scale dense kernel matrices. This dependence poses significant challenges in…
-
Learn then Decide: A Learning Approach for Designing Data Marketplaces
Learn then Decide: A Learning Approach for Designing Data Marketplaces arXiv:2503.10773v1 Announce Type: new Abstract: As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that…
-
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization arXiv:2503.10836v1 Announce Type: new Abstract: The contextual bandit framework is widely used to solve sequential optimization problems where the reward of each decision depends on auxiliary context variables. In settings such as medicine, business, and engineering, the decision maker often possesses additional structural information on the…
-
On the Identifiability of Causal Abstractions
On the Identifiability of Causal Abstractions arXiv:2503.10834v1 Announce Type: new Abstract: Causal representation learning (CRL) enhances machine learning models’ robustness and generalizability by learning structural causal models associated with data-generating processes. We focus on a family of CRL methods that uses contrastive data pairs in the observable space, generated before and after a random, unknown…
-
Mamba time series forecasting with uncertainty propagation
Mamba time series forecasting with uncertainty propagation arXiv:2503.10873v1 Announce Type: new Abstract: State space models, such as Mamba, have recently garnered attention in time series forecasting due to their ability to capture sequence patterns. However, in electricity consumption benchmarks, Mamba forecasts exhibit a mean error of approximately 8%. Similarly, in traffic occupancy benchmarks, the mean…
-
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many arXiv:2503.11209v1 Announce Type: new Abstract: We study the problem of clustering a set of items based on bandit feedback. Each of the $n$ items is characterized by a feature vector, with a possibly large dimension $d$. The items are partitioned into two unknown…
-
Weekly Entering & Transitioning – Thread 17 Mar, 2025 – 24 Mar, 2025
Weekly Entering & Transitioning – Thread 17 Mar, 2025 – 24 Mar, 2025 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…
-
The Impact of GenAI and Its Implications for Data Scientists
The Impact of GenAI and Its Implications for Data Scientists GenAI systems affect how we work. This general notion is well known. However, we are still unaware of the exact impact of GenAI. For example, how much do these tools affect our work? Do they have a larger impact on certain tasks? What does this…
-
Mastering Hadoop, Part 3: Hadoop Ecosystem: Get the most out of your cluster
Mastering Hadoop, Part 3: Hadoop Ecosystem: Get the most out of your cluster As we have already seen with the basic components (Part 1, Part 2), the Hadoop ecosystem is constantly evolving and being optimized for new applications. As a result, various tools and technologies have developed over time that make Hadoop more powerful and…
-
Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs
Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs Creating efficient prompts for large language models often starts as a simple task… but it doesn’t always stay that way. Initially, following basic best practices seems sufficient: adopt the persona of a specialist, write clear instructions, require a specific response format, and…
-
Nine Pico PIO Wats with Rust (Part 2)
Nine Pico PIO Wats with Rust (Part 2) This is Part 2 of an exploration into the unexpected quirks of programming the Raspberry Pi Pico PIO with Micropython. If you missed Part 1, we uncovered four Wats that challenge assumptions about register count, instruction slots, the behavior of pull noblock, and smart yet cheap hardware.…
-
Forget About Cloud Computing. On-Premises Is All the Rage Again
Forget About Cloud Computing. On-Premises Is All the Rage Again Ten years ago, everybody was fascinated by the cloud. It was the new thing, and companies that adopted it rapidly saw tremendous growth. Salesforce, for example, positioned itself as a pioneer of this technology and saw great wins. The tides are turning though. As much…
-
Power Spectrum Signatures of Graphs
Power Spectrum Signatures of Graphs arXiv:2503.09660v1 Announce Type: new Abstract: Point signatures based on the Laplacian operators on graphs, point clouds, and manifolds have become popular tools in machine learning for graphs, clustering, and shape analysis. In this work, we propose a novel point signature, the power spectrum signature, a measure on $\mathbb{R}$ defined as…
-
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks arXiv:2503.10496v1 Announce Type: new Abstract: Modeling natural phenomena with artificial neural networks (ANNs) often provides highly accurate predictions. However, ANNs often suffer from over-parameterization, complicating interpretation and raising uncertainty issues. Bayesian neural networks (BNNs) address the latter by representing weights as probability distributions, allowing…
-
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures arXiv:2503.10576v1 Announce Type: new Abstract: A common approach to generative modeling is to split model-fitting into two blocks: define first how to sample noise (e.g. Gaussian) and choose next what to do with it (e.g. using a single map or flows). We…
-
Technical Insights and Legal Considerations for Advancing Federated Learning in Bioinformatics
Technical Insights and Legal Considerations for Advancing Federated Learning in Bioinformatics arXiv:2503.09649v1 Announce Type: cross Abstract: Federated learning leverages data across institutions to improve clinical discovery while complying with data-sharing restrictions and protecting patient privacy. As the evolution of biobanks in genetics and systems biology has proved, accessing more extensive and varied data pools leads…
-
Bags of Projected Nearest Neighbours: Competitors to Random Forests?
Bags of Projected Nearest Neighbours: Competitors to Random Forests? arXiv:2503.09651v1 Announce Type: cross Abstract: In this paper we introduce a simple and intuitive adaptive k nearest neighbours classifier, and explore its utility within the context of bootstrap aggregating (“bagging”). The approach is based on finding discriminant subspaces which are computationally efficient to compute, and are…
-
Essential Review Papers on Physics-Informed Neural Networks: A Curated Guide for Practitioners
Essential Review Papers on Physics-Informed Neural Networks: A Curated Guide for Practitioners Staying on top of a fast-growing research field is never easy. I face this challenge firsthand as a practitioner in Physics-Informed Neural Networks (PINNs). New papers, be they algorithmic advancements or cutting-edge applications, are published at an accelerating pace by both academia and…
-
Anatomy of a Parquet File
Anatomy of a Parquet File In recent years, Parquet has become a standard format for data storage in Big Data ecosystems. Its column-oriented format offers several advantages: faster query execution when only a subset of columns is being processed; quick calculation of statistics across all data; reduced storage volume thanks to efficient compression. When combined…
-
Fourier Transform Applications in Literary Analysis
Fourier Transform Applications in Literary Analysis Poetry is often seen as a pure art form, ranging from the rigid structure of a haiku to the fluid, unconstrained nature of free-verse poetry. In analysing these works, though, to what extent can mathematics and Data Analysis be used to glean meaning from this free-flowing literature? Of course,…
-
Mastering Hadoop, Part 2: Getting Hands-On — Setting Up and Scaling Hadoop
Mastering Hadoop, Part 2: Getting Hands-On — Setting Up and Scaling Hadoop Now that we’ve explored Hadoop’s role and relevance, it’s time to show you how it works under the hood and how you can start working with it. To start, we are breaking down Hadoop’s core components — HDFS for storage, MapReduce for processing,…
-
Are You Still Using LoRA to Fine-Tune Your LLM?
Are You Still Using LoRA to Fine-Tune Your LLM? LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS … And most are based…
-
Learning Pareto manifolds in high dimensions: How can regularization help?
Learning Pareto manifolds in high dimensions: How can regularization help? arXiv:2503.08849v1 Announce Type: new Abstract: Simultaneously addressing multiple objectives is becoming increasingly important in modern machine learning. At the same time, data is often high-dimensional and costly to label. For a single objective such as prediction risk, conventional regularization techniques are known to improve generalization…
-
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation arXiv:2503.08902v1 Announce Type: new Abstract: Mutual Information (MI) is a crucial measure for capturing dependencies between variables, but exact computation is challenging in high dimensions with intractable likelihoods, impacting accuracy and robustness. One idea is to use an auxiliary neural network to train an MI…
-
Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms
Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms arXiv:2503.08896v1 Announce Type: new Abstract: This paper introduces a general framework for risk-sensitive bandits that integrates the notions of risk-sensitive objectives by adopting a rich class of distortion riskmetrics. The introduced framework subsumes the various existing risk-sensitive models. An important and hitherto unknown observation is that for…
-
Self-Consistent Equation-guided Neural Networks for Censored Time-to-Event Data
Self-Consistent Equation-guided Neural Networks for Censored Time-to-Event Data arXiv:2503.09097v1 Announce Type: new Abstract: In survival analysis, estimating the conditional survival function given predictors is often of interest. There is a growing trend in the development of deep learning methods for analyzing censored time-to-event data, especially when dealing with high-dimensional predictors that are complexly interrelated. Many…
-
Addressing pitfalls in implicit unobserved confounding synthesis using explicit block hierarchical ancestral sampling
Addressing pitfalls in implicit unobserved confounding synthesis using explicit block hierarchical ancestral sampling arXiv:2503.09194v1 Announce Type: new Abstract: Unbiased data synthesis is crucial for evaluating causal discovery algorithms in the presence of unobserved confounding, given the scarcity of real-world datasets. A common approach, implicit parameterization, encodes unobserved confounding by modifying the off-diagonal entries of the…
-
Probabilistic Shielding for Safe Reinforcement Learning
Probabilistic Shielding for Safe Reinforcement Learning arXiv:2503.07671v1 Announce Type: new Abstract: In real-life scenarios, a Reinforcement Learning (RL) agent aiming to maximise its reward must often also behave in a safe manner, including at training time. Thus, much attention in recent years has been given to Safe RL, where an agent aims to learn an…
-
Personalized Convolutional Dictionary Learning of Physiological Time Series
Personalized Convolutional Dictionary Learning of Physiological Time Series arXiv:2503.07687v1 Announce Type: new Abstract: Human physiological signals tend to exhibit both global and local structures: the former are shared across a population, while the latter reflect inter-individual variability. For instance, kinetic measurements of the gait cycle during locomotion present common characteristics, although idiosyncrasies may be observed…
-
Uncertainty quantification and posterior sampling for network reconstruction
Uncertainty quantification and posterior sampling for network reconstruction arXiv:2503.07736v1 Announce Type: new Abstract: Network reconstruction is the task of inferring the unseen interactions between elements of a system, based only on their behavior or dynamics. This inverse problem is in general ill-posed, and admits many solutions for the same observation. Nevertheless, the vast majority of…
-
Cost-Aware Optimal Pairwise Pure Exploration
Cost-Aware Optimal Pairwise Pure Exploration arXiv:2503.07877v1 Announce Type: new Abstract: Pure exploration is one of the fundamental problems in multi-armed bandits (MAB). However, existing works mostly focus on specific pure exploration tasks, without a holistic view of the general pure exploration problem. This work fills this gap by introducing a versatile framework to study pure…
-
Pure Exploration with Feedback Graphs
Pure Exploration with Feedback Graphs arXiv:2503.07824v1 Announce Type: new Abstract: We study the sample complexity of pure exploration in an online learning problem with a feedback graph. This graph dictates the feedback available to the learner, covering scenarios between full-information, pure bandit feedback, and settings with no feedback on the chosen action. While variants of…
-
7 Powerful DBeaver Tips and Tricks to Improve Your SQL Workflow
7 Powerful DBeaver Tips and Tricks to Improve Your SQL Workflow DBeaver is the most powerful open-source SQL IDE, but there are several features people don’t know about. In this post, I will share with you several features to speed up your workflow, with zero fluff. I’ve learned these as I’m currently digging deeper into…
-
How to Switch from Data Analyst to Data Scientist
How to Switch from Data Analyst to Data Scientist Are you a Data Analyst looking to break into data science? If so, this post is for you. Many people start in analytics because it generally has a lower barrier to entry, but as they gain experience, they realize they want to take on more technical…
-
Experiments Illustrated: Can $1 Change Behavior More Than $100?
Experiments Illustrated: Can $1 Change Behavior More Than $100? I currently lead a small data team at a small tech company. With everything small, we have a lot of autonomy over what, when, and how we run experiments. In this series, I’m opening the vault from our years of experimenting, each story highlighting a key…
-
Mastering Hadoop, Part 1: Installation, Configuration, and Modern Big Data Strategies
Mastering Hadoop, Part 1: Installation, Configuration, and Modern Big Data Strategies Nowadays, a large amount of data is collected on the internet, which is why companies are faced with the challenge of being able to store, process, and analyze these volumes efficiently. Hadoop is an open-source framework from the Apache Software Foundation and has become…
-
How to Develop Complex DAX Expressions
How to Develop Complex DAX Expressions At some point or another, any Power BI developer must write complex DAX expressions to analyze data. But nobody tells you how to do it. What’s the process for doing it? What is the best way to do it, and how supportive can a development process be? These are the questions…
-
Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature
Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature arXiv:2503.06079v1 Announce Type: new Abstract: Despite the significance of probabilistic time-series forecasting models, their evaluation metrics often involve intractable integrations. The most widely used metric, the continuous ranked probability score (CRPS), is a strictly proper scoring function; however, its computation requires approximation. We found…
-
On Statistical Estimation of Edge-Reinforced Random Walks
On Statistical Estimation of Edge-Reinforced Random Walks arXiv:2503.06115v1 Announce Type: new Abstract: Reinforced random walks (RRWs), including vertex-reinforced random walks (VRRWs) and edge-reinforced random walks (ERRWs), model random walks where the transition probabilities evolve based on prior visitation history~\cite{mgr, fmk, tarres, volkov}. These models have found applications in various areas, such as network representation learning~\cite{xzzs},…
-
Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments
Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments arXiv:2503.06156v1 Announce Type: new Abstract: Uncovering causal mediation effects is of significant value to practitioners seeking to isolate the direct treatment effect from the potential mediated effect. We propose a double machine learning (DML) algorithm for mediation analysis that supports continuous treatments. To estimate the…
-
Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model
Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model arXiv:2503.06381v1 Announce Type: new Abstract: This paper deals with the identification of the stochastic Ornstein-Uhlenbeck (OU) process error model, which is characterized by an inverse time constant, and the unknown variances of the process and observation noises. Although the availability of the explicit expression of the log-likelihood…
-
Platform-Mesh, Hub and Spoke, and Centralised | 3 Types of data team
Platform-Mesh, Hub and Spoke, and Centralised | 3 Types of data team Introduction In the “ever rapidly changing landscape of Data and AI” (!), understanding data and AI architecture has never been more critical. However, something many leaders overlook is the importance of data team structure. While many of you reading this probably identify as the data…
-
Linear Regression in Time Series: Sources of Spurious Regression
Linear Regression in Time Series: Sources of Spurious Regression 1. Introduction It’s pretty clear that most of our work will be automated by AI in the future. This will be possible because many researchers and professionals are working hard to make their work available online. These contributions not only help us understand fundamental concepts but…
-
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities Introduction: Can AI really distinguish dog breeds like human experts? One day while taking a walk, I saw a fluffy white puppy and wondered, Is that a Bichon Frise or a Maltese? No matter how closely I looked, they seemed almost identical.…
-
Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend
Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend Running cool experiments is easily one of my favorite parts of working in data science. Most experiments don’t deliver big wins, so the winners make for fun stories. We’ve had a few of these at IntelyCare, and I’m sharing each story in a way…
-
Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board
Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board Running experiments is a task that often falls to data scientists. If that’s you, congrats! It can be a rewarding and high-impact area of work, but also requires tools found outside the typical ML-heavy data science curriculum. Even with the best tools, only…
-
A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD
A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD arXiv:2503.04820v1 Announce Type: new Abstract: This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these discrepancies are presented, including the commonly-used V-statistics and U-statistics,…
-
Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification
Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification arXiv:2503.04956v1 Announce Type: new Abstract: In this paper we propose a novel problem called the ForeClassing problem where the loss of a classification decision is only observed at a future time point after the classification decision has to be made.…
-
A characterization of sample adaptivity in UCB data
A characterization of sample adaptivity in UCB data arXiv:2503.04855v1 Announce Type: new Abstract: We characterize a joint CLT of the number of pulls and the sample mean reward of the arms in a stochastic two-armed bandit environment under UCB algorithms. Several implications of this result are in place: (1) a nonstandard CLT of the number…
-
Empirical Bound Information-Directed Sampling for Norm-Agnostic Bandits
Empirical Bound Information-Directed Sampling for Norm-Agnostic Bandits arXiv:2503.05098v1 Announce Type: new Abstract: Information-directed sampling (IDS) is a powerful framework for solving bandit problems which has shown strong results in both Bayesian and frequentist settings. However, frequentist IDS, like many other bandit algorithms, requires that one have prior knowledge of a (relatively) tight upper bound on…
-
Topology-Aware Conformal Prediction for Stream Networks
Topology-Aware Conformal Prediction for Stream Networks arXiv:2503.04981v1 Announce Type: new Abstract: Stream networks, a unique class of spatiotemporal graphs, exhibit complex directional flow constraints and evolving dependencies, making uncertainty quantification a critical yet challenging task. Traditional conformal prediction methods struggle in this setting due to the need for joint predictions across multiple interdependent locations and…
-
Weekly Entering & Transitioning – Thread 10 Mar, 2025 – 17 Mar, 2025
Weekly Entering & Transitioning – Thread 10 Mar, 2025 – 17 Mar, 2025 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…
-
Custom Training Pipeline for Object Detection Models
Custom Training Pipeline for Object Detection Models What if you want to write the whole object detection training pipeline from scratch, so you can understand each step and be able to customize it? That’s what I set out to do. I examined several well-known object detection pipelines and designed one that best suits my needs…
-
Comprehensive Guide to Dependency Management in Python
Comprehensive Guide to Dependency Management in Python Introduction When learning Python, many beginners focus solely on the language and its libraries while completely ignoring virtual environments. As a result, managing Python projects can become a mess: dependencies installed for different projects may have conflicting versions, leading to compatibility issues. Even when I studied Python, nobody…
-
Using GPT-4 for Personal Styling
Using GPT-4 for Personal Styling I’ve always been fascinated by Fashion—collecting unique pieces and trying to blend them in my own way. But let’s just say my closet was more of a work-in-progress avalanche than a curated wonderland. Every time I tried to add something new, I risked toppling my carefully balanced piles. Why this…
-
Image Captioning, Transformer Mode On
Image Captioning, Transformer Mode On Introduction In my previous article, I discussed one of the earliest Deep Learning approaches for image captioning. If you’re interested in reading it, you can find the link to that article at the end of this one. Today, I would like to talk about Image Captioning again, but this time…
-
When You Just Can’t Decide on a Single Action
When You Just Can’t Decide on a Single Action In Game Theory, the players typically have to make assumptions about the other players’ actions. What will the other player do? Will he use rock, paper or scissors? You never know, but in some cases, you might have an idea of the probability of some actions…
-
Reheated Gradient-based Discrete Sampling for Combinatorial Optimization
Reheated Gradient-based Discrete Sampling for Combinatorial Optimization arXiv:2503.04047v1 Announce Type: new Abstract: Recently, gradient-based discrete sampling has emerged as a highly efficient, general-purpose solver for various combinatorial optimization (CO) problems, achieving performance comparable to or surpassing the popular data-driven approaches. However, we identify a critical issue in these methods, which we term “wandering in contours”.…
-
Conformal Prediction with Upper and Lower Bound Models
Conformal Prediction with Upper and Lower Bound Models arXiv:2503.04071v1 Announce Type: new Abstract: This paper studies a Conformal Prediction (CP) methodology for building prediction intervals in a regression setting, given only deterministic lower and upper bounds on the target variable. It proposes a new CP mechanism (CPUL) that goes beyond post-processing by adopting a model…
-
Generalization in Federated Learning: A Conditional Mutual Information Framework
Generalization in Federated Learning: A Conditional Mutual Information Framework arXiv:2503.04091v1 Announce Type: new Abstract: Federated Learning (FL) is a widely adopted privacy-preserving distributed learning framework, yet its generalization performance remains less explored compared to centralized learning. In FL, the generalization error consists of two components: the out-of-sample gap, which measures the gap between the empirical…
-
Learning Causal Response Representations through Direct Effect Analysis
Learning Causal Response Representations through Direct Effect Analysis arXiv:2503.04358v1 Announce Type: new Abstract: We propose a novel approach for learning causal response representations. Our method aims to extract directions in which a multidimensional outcome is most directly caused by a treatment variable. By bridging conditional independence testing with causal representation learning, we formulate an optimisation…
-
How to Spot and Prevent Model Drift Before it Impacts Your Business
How to Spot and Prevent Model Drift Before it Impacts Your Business Despite the AI hype, many tech companies still rely heavily on machine learning to power critical applications, from personalized recommendations to fraud detection. I’ve seen firsthand how undetected drifts can result in significant costs — missed fraud detection, lost revenue, and suboptimal business…
-
Applications of Entropy in Data Analysis and Machine Learning: A Review
Applications of Entropy in Data Analysis and Machine Learning: A Review arXiv:2503.02921v1 Announce Type: new Abstract: Since its origin in the thermodynamics of the 19th century, the concept of entropy has also permeated other fields of physics and mathematics, such as Classical and Quantum Statistical Mechanics, Information Theory, Probability Theory, Ergodic Theory and the Theory…
-
LAPD: Langevin-Assisted Bayesian Active Learning for Physical Discovery
LAPD: Langevin-Assisted Bayesian Active Learning for Physical Discovery arXiv:2503.02983v1 Announce Type: new Abstract: Discovering physical laws from data is a fundamental challenge in scientific research, particularly when high-quality data are scarce or costly to obtain. Traditional methods for identifying dynamical systems often struggle with noise sensitivity, inefficiency in data usage, and the inability to quantify…
-
PAC Learning with Improvements
PAC Learning with Improvements arXiv:2503.03184v1 Announce Type: new Abstract: One of the most basic lower bounds in machine learning is that in nearly any nontrivial setting, it takes $\textit{at least}$ $1/\epsilon$ samples to learn to error $\epsilon$ (and more, if the classifier being learned is complex). However, suppose that data points are agents who have…
-
Convergence Rates for Softmax Gating Mixture of Experts
Convergence Rates for Softmax Gating Mixture of Experts arXiv:2503.03213v1 Announce Type: new Abstract: Mixture of experts (MoE) has recently emerged as an effective framework to advance the efficiency and scalability of machine learning models by softly dividing complex tasks among multiple specialized sub-models termed experts. Central to the success of MoE is an adaptive softmax…
-
Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations
Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations arXiv:2503.03283v1 Announce Type: new Abstract: Drawing parallels with the way biological networks are studied, we adapt the treatment–control paradigm to explainable artificial intelligence research and enrich it through multi-parametric input alterations. In this study, we propose a framework for investigating…
-
One-Tailed Vs. Two-Tailed Tests
One-Tailed Vs. Two-Tailed Tests Introduction If you’ve ever analyzed data using built-in t-test functions, such as those in R or SciPy, here’s a question for you: have you ever adjusted the default setting for the alternative hypothesis? If your answer is no—or if you’re not even sure what this means—then this blog post is for…
-
Kubernetes — Understanding and Utilizing Probes Effectively
Kubernetes — Understanding and Utilizing Probes Effectively Introduction Let’s talk about Kubernetes probes and why they matter in your deployments. When managing production-facing containerized applications, even small optimizations can have enormous benefits. Reducing deployment times, making your applications react better to scaling events, and managing the health of running pods all require fine-tuning your container…
-
Overcome Failing Document Ingestion & RAG Strategies with Agentic Knowledge Distillation
Overcome Failing Document Ingestion & RAG Strategies with Agentic Knowledge Distillation Introduction Many generative AI use cases still revolve around Retrieval Augmented Generation (RAG), yet consistently fall short of user expectations. Despite the growing body of research on RAG improvements and even adding Agents into the process, many solutions still fail to return exhaustive results,…
-
Generative AI Is Declarative
Generative AI Is Declarative ChatGPT launched in 2022 and kicked off the Generative AI boom. In the two years since, academics, technologists, and armchair experts have written libraries’ worth of articles on the technical underpinnings of generative AI and about the potential capabilities of both current and future generative AI models. Surprisingly little has been…
-
Mathematical Foundation of Interpretable Equivariant Surrogate Models
Mathematical Foundation of Interpretable Equivariant Surrogate Models arXiv:2503.01942v1 Announce Type: new Abstract: This paper introduces a rigorous mathematical framework for neural network explainability, and more broadly for the explainability of equivariant operators called Group Equivariant Operators (GEOs) based on Group Equivariant Non-Expansive Operators (GENEOs) transformations. The central concept involves quantifying the distance between GEOs by…
-
Gradient-free stochastic optimization for additive models
Gradient-free stochastic optimization for additive models arXiv:2503.02131v1 Announce Type: new Abstract: We address the problem of zero-order optimization from noisy observations for an objective function satisfying the Polyak-Łojasiewicz or the strong convexity condition. Additionally, we assume that the objective function has an additive structure and satisfies a higher-order smoothness property, characterized by the Hölder family…
-
Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification
Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification arXiv:2503.02110v1 Announce Type: new Abstract: We provide a complete characterization of the entire regularization curve of a modified two-part-code Minimum Description Length (MDL) learning rule for binary classification, based on an arbitrary prior or description language. \citet{GL} previously established the lack of asymptotic…
-
Online Inference for Quantiles by Constant Learning-Rate Stochastic Gradient Descent
Online Inference for Quantiles by Constant Learning-Rate Stochastic Gradient Descent arXiv:2503.02178v1 Announce Type: new Abstract: This paper proposes an online inference method of the stochastic gradient descent (SGD) with a constant learning rate for quantile loss functions with theoretical guarantees. Since the quantile loss function is neither smooth nor strongly convex, we view such SGD…
-
Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements
Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements arXiv:2503.02437v1 Announce Type: new Abstract: This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner. Our proposed method, LGTC-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by integrating dynamic cluster consensus, a mechanism that allows agents to form…
-
Deep Research by OpenAI: A Practical Test of AI-Powered Literature Review
Deep Research by OpenAI: A Practical Test of AI-Powered Literature Review “Conduct a comprehensive literature review on the state-of-the-art in Machine Learning and energy consumption. […]” With this prompt, I tested the new Deep Research function, which has been integrated into the OpenAI o3 reasoning model since the end of February — and conducted a state-of-the-art literature…
-
Mastering 1:1s as a Data Scientist: From Status Updates to Career Growth
Mastering 1:1s as a Data Scientist: From Status Updates to Career Growth I have been a data team manager for six months, and my team has grown from three to five. I wrote about my initial manager experiences back in November. In this article, I want to talk about something that is more essential to…
-
Practical SQL Puzzles That Will Level Up Your Skill
Practical SQL Puzzles That Will Level Up Your Skill There are some SQL patterns that, once you know them, you start seeing them everywhere. The solutions to the puzzles that I will show you today are actually very simple SQL queries, but understanding the concept behind them will surely unlock new solutions to the queries…
-
The Urgent Need for Intrinsic Alignment Technologies for Responsible Agentic AI
The Urgent Need for Intrinsic Alignment Technologies for Responsible Agentic AI Advancements in agentic artificial intelligence (AI) promise to bring significant opportunities to individuals and businesses in all sectors. However, as AI agents become more autonomous, they may use scheming behavior or break rules to achieve their functional goals. This can lead to the machine…
-
Approaching the Harm of Gradient Attacks While Only Flipping Labels
Approaching the Harm of Gradient Attacks While Only Flipping Labels arXiv:2503.00140v1 Announce Type: new Abstract: Availability attacks are one of the strongest forms of training-phase attacks in machine learning, making the model unusable. While prior work in distributed ML has demonstrated such effect via gradient attacks and, more recently, data poisoning, we ask: can similar…
-
An interpretation of the Brownian bridge as a physics-informed prior for the Poisson equation
An interpretation of the Brownian bridge as a physics-informed prior for the Poisson equation arXiv:2503.00213v1 Announce Type: new Abstract: Physics-informed machine learning is one of the most commonly used methods for fusing physical knowledge in the form of partial differential equations with experimental data. The idea is to construct a loss function where the physical…
-
Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits arXiv:2503.00273v1 Announce Type: new Abstract: We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we…
-
LNUCB-TA: Linear-nonlinear Hybrid Bandit Learning with Temporal Attention
LNUCB-TA: Linear-nonlinear Hybrid Bandit Learning with Temporal Attention arXiv:2503.00387v1 Announce Type: new Abstract: Existing contextual multi-armed bandit (MAB) algorithms fail to effectively capture both long-term trends and local patterns across all arms, leading to suboptimal performance in environments with rapidly changing reward structures. They also rely on static exploration rates, which do not dynamically adjust…
-
Generalization Bounds for Equivariant Networks on Markov Data
Generalization Bounds for Equivariant Networks on Markov Data arXiv:2503.00292v1 Announce Type: new Abstract: Equivariant neural networks play a pivotal role in analyzing datasets with symmetry properties, particularly in complex data structures. However, integrating equivariance with Markov properties presents notable challenges due to the inherent dependencies within such data. Previous research has primarily concentrated on establishing…
-
How to Train LLMs to “Think” (o1 & DeepSeek-R1)
How to Train LLMs to “Think” (o1 & DeepSeek-R1) In September 2024, OpenAI released its o1 model, trained on large-scale reinforcement learning, giving it “advanced reasoning” capabilities. Unfortunately, the details of how they pulled this off were never shared publicly. Today, however, DeepSeek (an AI research lab) has replicated this reasoning behavior and published the…
-
Generative AI and Civic Institutions
Generative AI and Civic Institutions Different sectors, different goals Recent events have got me thinking about AI as it relates to our civic institutions — think government, education, public libraries, and so on. We often forget that civic and governmental organizations are inherently deeply different from private companies and profit-making enterprises. They exist to enable…
-
LLM + RAG: Creating an AI-Powered File Reader Assistant
LLM + RAG: Creating an AI-Powered File Reader Assistant Introduction AI is everywhere. It is hard not to interact at least once a day with a Large Language Model (LLM). The chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails…well, they do a lot.…
-
Data Science: From School to Work, Part II
Data Science: From School to Work, Part II In my previous article, I highlighted the importance of effective project management in Python development. Now, let’s shift our focus to the code itself and explore how to write clean, maintainable code — an essential practice in professional and collaborative environments. Readability & Maintainability: Well-structured code is easier to…
-
Avoidable and Unavoidable Randomness in GPT-4o
Avoidable and Unavoidable Randomness in GPT-4o Of course there is randomness in GPT-4o’s outputs. After all, the model samples from a probability distribution when choosing each token. But what I didn’t understand was that those very probabilities themselves are not deterministic. Even with consistent prompts, fixed seeds, and temperature set to zero, GPT-4o still introduces…
-
Transfer Learning through Enhanced Sufficient Representation: Enriching Source Domain Knowledge with Target Data
Transfer Learning through Enhanced Sufficient Representation: Enriching Source Domain Knowledge with Target Data arXiv:2502.20414v1 Announce Type: new Abstract: Transfer learning is an important approach for addressing the challenges posed by limited data availability in various applications. It accomplishes this by transferring knowledge from well-established source domains to a less familiar target domain. However, traditional transfer…
-
Efficient Risk-sensitive Planning via Entropic Risk Measures
Efficient Risk-sensitive Planning via Entropic Risk Measures arXiv:2502.20423v1 Announce Type: new Abstract: Risk-sensitive planning aims to identify policies maximizing some tail-focused metrics in Markov Decision Processes (MDPs). Such an optimization task can be very costly for the most widely used and interpretable metrics such as threshold probabilities or (Conditional) Values at Risk. Indeed, previous work…
-
Amortized Conditional Independence Testing
Amortized Conditional Independence Testing arXiv:2502.20925v1 Announce Type: new Abstract: Testing for the conditional independence structure in data is a fundamental and critical task in statistics and machine learning, which finds natural applications in causal discovery – a highly relevant problem to many scientific disciplines. Existing methods seek to design explicit test statistics that quantify the…
-
Learning Dynamics of Deep Linear Networks Beyond the Edge of Stability
Learning Dynamics of Deep Linear Networks Beyond the Edge of Stability arXiv:2502.20531v1 Announce Type: new Abstract: Deep neural networks trained using gradient descent with a fixed learning rate $\eta$ often operate in the regime of “edge of stability” (EOS), where the largest eigenvalue of the Hessian equilibrates about the stability threshold $2/\eta$. In this work,…
-
Post-Hoc Uncertainty Quantification in Pre-Trained Neural Networks via Activation-Level Gaussian Processes
Post-Hoc Uncertainty Quantification in Pre-Trained Neural Networks via Activation-Level Gaussian Processes arXiv:2502.20966v1 Announce Type: new Abstract: Uncertainty quantification in neural networks through methods such as Dropout, Bayesian neural networks and Laplace approximations is either prone to underfitting or computationally demanding, rendering these approaches impractical for large-scale datasets. In this work, we address these shortcomings by…
-
Weekly Entering & Transitioning – Thread 03 Mar, 2025 – 10 Mar, 2025
Weekly Entering & Transitioning – Thread 03 Mar, 2025 – 10 Mar, 2025 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: Learning resources (e.g. books, tutorials, videos) Traditional education (e.g. schools, degrees, electives) Alternative education (e.g.…