Category: aimlds
-
How to Build An AI Agent with Function Calling and GPT-5
How an AI agent works: a step-by-step guide. First appeared on Towards Data Science, by Ayoola Olafenwa.
-
How to Use Frontier Vision LLMs: Qwen3-VL
Learn how to apply VLMs to advanced document understanding tasks. First appeared on Towards Data Science, by Eivind Kjosbakken.
-
How I Tailored the Resume That Landed Me $100K+ Data Science and ML Offers
How to write a data science and machine learning resume that actually lands jobs. First appeared on Towards Data Science, by Egor Howell.
-
Things I Learned by Participating in GenAI Hackathons Over the Past 6 Months
Sharing my two cents from the building in public journey so far. First appeared on Towards Data Science, by Parul Pandey.
-
From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons
arXiv:2510.15012v1 Announce Type: new Abstract: We revisit the Universal Approximation Theorem (UAT) through the lens of the tropical geometry of neural networks and introduce a constructive, geometry-aware initialization for sigmoidal multi-layer perceptrons (MLPs). Tropical geometry shows that Rectified Linear Unit (ReLU) networks admit decision functions with…
-
Reliable data clustering with Bayesian community detection
arXiv:2510.15013v1 Announce Type: new Abstract: From neuroscience and genomics to systems biology and ecology, researchers rely on clustering similarity data to uncover modular structure. Yet widely used clustering methods, such as hierarchical clustering, k-means, and WGCNA, lack principled model selection, leaving them susceptible to noise. A common workaround…
-
The Coverage Principle: How Pre-training Enables Post-Training
arXiv:2510.15020v1 Announce Type: new Abstract: Language models demonstrate remarkable abilities when pre-trained on large text corpora and fine-tuned for specific tasks, but how and why pre-training shapes the success of the final model remains poorly understood. Notably, although pre-training success is often quantified by cross entropy loss, cross-entropy…
-
The Tree-SNE Tree Exists
arXiv:2510.15014v1 Announce Type: new Abstract: The clustering and visualisation of high-dimensional data is a ubiquitous task in modern data science. Popular techniques include nonlinear dimensionality reduction methods like t-SNE or UMAP. These methods face the ‘scale-problem’ of clustering: when dealing with the MNIST dataset, do we want to distinguish different digits…
-
The Minimax Lower Bound of Kernel Stein Discrepancy Estimation
arXiv:2510.15058v1 Announce Type: new Abstract: Kernel Stein discrepancies (KSDs) have emerged as a powerful tool for quantifying goodness-of-fit over the last decade, featuring numerous successful applications. To the best of our knowledge, all existing KSD estimators with known rate achieve $\sqrt{n}$-convergence. In this work, we…
-
Weekly Entering & Transitioning – Thread 20 Oct, 2025 – 27 Oct, 2025
Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…
-
How to perform synthetic control for multiple treated units? What are the things to keep in mind while performing it? Also, what python package i could use? Also have questions about metrics
Hi, I have never done synthetic control. I want to work on a small project (like small data). My task is to find…
-
Anyone else tired of the non-stop LLM hype in personal and/or professional life?
I have a complex relationship with LLMs. At work, I’m told they’re the best thing since the invention of the internet, electricity, or [insert other trite comparison here], and that I’ll lose my job to people who do use them if I…
-
I built a project and I thought I might share it with the group
Disclaimer: It’s UK focused. Hi everyone, When I was looking to buy a house, a big annoyance I had was that I couldn’t easily tell if I was getting value for money. Although, in my opinion, any property is expensive as…
-
Transformers, Time Series, and the Myth of Permutation Invariance
There’s a common misconception in ML/DL that Transformers shouldn’t be used for forecasting because attention is permutation-invariant. Recent evidence shows the opposite: for example, experiments with Google’s latest model show that it performs just as well with or without positional embeddings. You can find an…
-
Conceptual Frameworks for Data Science Projects
An overview of common framework types and a simple process for building custom frameworks. First appeared on Towards Data Science, by Chinmay Kakatkar.
-
How to Build Guardrails for Effective Agents
Learn how to set up effective guardrails to enforce desired behaviour from your agents. First appeared on Towards Data Science, by Eivind Kjosbakken.
-
Can We Save the AI Economy?
And do we want to? First appeared on Towards Data Science, by Stephanie Kirmer.
-
Python 3.14 and the End of the GIL
Exploring the opportunities and challenges of a GIL-free Python. First appeared on Towards Data Science, by Thomas Reid.
-
Machine Learning Meets Panel Data: What Practitioners Need to Know
How to avoid overestimating machine learning models’ performance, usefulness, and real-world applicability due to hidden data leakage. First appeared on Towards Data Science, by Marco Letta.
-
How to Classify Lung Cancer Subtype from DNA Copy Numbers Using PyTorch
A step-by-step introduction to understanding cancer from the perspective of a data scientist. First appeared on Towards Data Science, by Adam Streck.
-
How I Used Machine Learning to Predict 41% of Project Delays Before They Happened
How data science can help project managers anticipate risks and save time. First appeared on Towards Data Science, by Yassin Zehar.
-
Exact Dynamics of Multi-class Stochastic Gradient Descent
arXiv:2510.14074v1 Announce Type: new Abstract: We develop a framework for analyzing the training and learning rate dynamics on a variety of high-dimensional optimization problems trained using one-pass stochastic gradient descent (SGD) with data generated from multiple anisotropic classes. We give exact expressions for a large class of…
-
deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss
arXiv:2510.14092v1 Announce Type: new Abstract: In this paper we develop a deforestation detection pipeline that incorporates optical and Synthetic Aperture Radar (SAR) data. A crucial component of the pipeline is the construction of anomaly maps of the optical data, which is done using…
-
High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data
arXiv:2510.14145v1 Announce Type: new Abstract: Determining the appropriate number of clusters in unsupervised learning is a central problem in statistics and data science. Traditional validity indices, such as Calinski-Harabasz, Silhouette, and Davies-Bouldin, depend on centroid-based distances and therefore degrade in high-dimensional or contaminated data. This…
-
A novel Information-Driven Strategy for Optimal Regression Assessment
arXiv:2510.14222v1 Announce Type: new Abstract: In Machine Learning (ML), a regression algorithm aims to minimize a loss function based on data. An assessment method in this context seeks to quantify the discrepancy between the optimal response for an input-output system and the estimate produced by a learned…
-
Personalized federated learning, Row-wise fusion regularization, Multivariate modeling, Sparse estimation
arXiv:2510.14413v1 Announce Type: new Abstract: We study personalized federated learning for multivariate responses where client models are heterogeneous yet share variable-level structure. Existing entry-wise penalties ignore cross-response dependence, while matrix-wise fusion over-couples clients. We propose a Sparse Row-wise Fusion (SROF) regularizer that clusters row vectors…
-
Feature Detection, Part 1: Image Derivatives, Gradients, and Sobel Operator
Applying calculus fundamentals to computer vision for edge detection. First appeared on Towards Data Science, by Vyacheslav Efimov.
-
Stop Feeling Lost : How to Master ML System Design
What machine learning system design is and how to prepare for it. First appeared on Towards Data Science, by Egor Howell.
-
A Beginner’s Guide to Robotics with Python
Build 3D simulations with PyBullet. First appeared on Towards Data Science, by Mauro Di Pietro.
-
How to Evaluate Retrieval Quality in RAG Pipelines: Precision@k, Recall@k, and F1@k
In my previous posts, I have walked you through putting together a very basic RAG pipeline in Python, as well as chunking large text documents. We’ve also looked into how documents are transformed into embeddings, allowing us to quickly search for similar documents…
-
Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space
arXiv:2510.12916v1 Announce Type: new Abstract: Systems of interacting continuous-time Markov chains are a powerful model class, but inference is typically intractable in high dimensional settings. Auxiliary information, such as noisy observations, is typically only available at discrete times, and incorporating it via…
-
Simplicial Gaussian Models: Representation and Inference
arXiv:2510.12983v1 Announce Type: new Abstract: Probabilistic graphical models (PGMs) are powerful tools for representing statistical dependencies through graphs in high-dimensional systems. However, they are limited to pairwise interactions. In this work, we propose the simplicial Gaussian model (SGM), which extends Gaussian PGM to simplicial complexes. SGM jointly models random…
-
Conformal Inference for Open-Set and Imbalanced Classification
arXiv:2510.13037v1 Announce Type: new Abstract: This paper presents a conformal prediction method for classification in highly imbalanced and open-set settings, where there are many possible classes and not all may be represented in the data. Existing approaches require a finite, known label space and typically involve random sample…
-
A Multi-dimensional Semantic Surprise Framework Based on Low-Entropy Semantic Manifolds for Fine-Grained Out-of-Distribution Detection
arXiv:2510.13093v1 Announce Type: new Abstract: Out-of-Distribution (OOD) detection is a cornerstone for the safe deployment of AI systems in the open world. However, existing methods treat OOD detection as a binary classification problem, a cognitive flattening that fails to distinguish between…
-
Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach
arXiv:2510.13094v1 Announce Type: new Abstract: Machine unlearning seeks to efficiently remove the influence of selected data while preserving generalization. Significant progress has been made in low dimensions $(p \ll n)$, but high dimensions pose serious theoretical challenges as standard optimization assumptions of $\Omega(1)$ strong convexity…
-
First Principles Thinking for Data Scientists
The mindset that turns good data scientists into great ones. First appeared on Towards Data Science, by Greg Rafferty.
-
Prompt Engineering for Time-Series Analysis with Large Language Models
Part 1: Prompts for Core Strategies in Time-Series. First appeared on Towards Data Science, by Sara Nobrega.
-
Beyond Requests: Why httpx is the Modern HTTP Client You Need (Sometimes)
A comprehensive comparison of these two Python libraries. First appeared on Towards Data Science, by Thomas Reid.
-
How to Build Tools for AI Agents
Learn how to design and build effective tools to be used by AI Agents. First appeared on Towards Data Science, by Eivind Kjosbakken.
-
Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
arXiv:2510.11789v1 Announce Type: new Abstract: We study the convergence rate of learning pairwise interactions in single-layer attention-style models, where tokens interact through a weight matrix and a non-linear activation function. We prove that the minimax rate is $M^{-\frac{2\beta}{2\beta+1}}$ with $M$ being the sample size, depending…
-
On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization
arXiv:2510.11792v1 Announce Type: new Abstract: In Bayesian Optimization (BO), additive assumptions can mitigate the twin difficulties of modeling and searching a complex function in high dimension. However, common acquisition functions, like the Additive Lower Confidence Bound, ignore pairwise covariances between dimensions, which we’ll call bilateral…
-
Active Subspaces in Infinite Dimension
arXiv:2510.11871v1 Announce Type: new Abstract: Active subspace analysis uses the leading eigenspace of the gradient’s second moment to conduct supervised dimension reduction. In this article, we extend this methodology to real-valued functionals on Hilbert space. We define an operator which coincides with the active subspace matrix when applied to a…
-
High-Probability Bounds For Heterogeneous Local Differential Privacy
arXiv:2510.11895v1 Announce Type: new Abstract: We study statistical estimation under local differential privacy (LDP) when users may hold heterogeneous privacy levels and accuracy must be guaranteed with high probability. Departing from the common in-expectation analyses, and for one-dimensional and multi-dimensional mean estimation problems, we develop finite sample upper…
-
Simplifying Optimal Transport through Schatten-$p$ Regularization
arXiv:2510.11910v1 Announce Type: new Abstract: We propose a new general framework for recovering low-rank structure in optimal transport using Schatten-$p$ norm regularization. Our approach extends existing methods that promote sparse and interpretable transport maps or plans, while providing a unified and principled family of convex programs that encourage low-dimensional…
-
Learning Triton One Kernel at a Time: Matrix Multiplication
Tiled GEMM, GPU memory, coalescing, and much more! First appeared on Towards Data Science, by Ryan Pégoud.
-
Building A Successful Relationship With Stakeholders
Show your value by moving beyond the technical. First appeared on Towards Data Science, by Kristopher McGlinchey.
-
Why AI Still Can’t Replace Analysts: A Predictive Maintenance Example
Learn about the limitations of AI in analytics through the example of bearing vibration data analysis. First appeared on Towards Data Science, by Illia Smoliienko.
-
Human Won’t Replace Python
Why vibe-coding is not a step up from “classic” coding, and why it matters. First appeared on Towards Data Science, by Elisha Rosensweig.
-
Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation
arXiv:2510.09908v1 Announce Type: new Abstract: The rise of large-scale pretrained models has made it feasible to generate predictive or synthetic features at low cost, raising the question of how to incorporate such surrogate predictions into downstream decision-making. We study this problem in the setting of…
-
Calibrating Generative Models
arXiv:2510.10020v1 Announce Type: new Abstract: Generative models frequently suffer miscalibration, wherein class probabilities and other statistics of the sampling distribution deviate from desired values. We frame calibration as a constrained optimization problem and seek the closest model in Kullback-Leibler divergence satisfying calibration constraints. To address the intractability of imposing these constraints exactly,…
-
Kernel Treatment Effects with Adaptively Collected Data
arXiv:2510.10245v1 Announce Type: new Abstract: Adaptive experiments improve efficiency by adjusting treatment assignments based on past outcomes, but this adaptivity breaks the i.i.d. assumptions that underpin classical asymptotics. At the same time, many questions of interest are distributional, extending beyond average effects. Kernel treatment effects (KTE) provide a…
-
Neural variational inference for cutting feedback during uncertainty propagation
arXiv:2510.10268v1 Announce Type: new Abstract: In many scientific applications, uncertainty of estimates from an earlier (upstream) analysis needs to be propagated in subsequent (downstream) Bayesian analysis, without feedback. Cutting feedback methods, also termed cut-Bayes, achieve this by constructing a cut-posterior distribution that prevents backward information flow.…
-
On some practical challenges of conformal prediction
arXiv:2510.10324v1 Announce Type: new Abstract: Conformal prediction is a model-free machine learning method for creating prediction regions with a guaranteed coverage probability level. However, a data scientist often faces three challenges in practice: (i) the determination of a conformal prediction region is only approximate, jeopardizing the finite-sample validity…
-
How to Spin Up a Project Structure with Cookiecutter
If you’re anything like me, “procrastination” might as well be your middle name. There’s always that nagging hesitation before starting a new project. Just thinking about setting up the project structure, creating documentation, or writing a decent README is enough to trigger yawns. It feels like…
-
A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
arXiv:2510.08916v1 Announce Type: new Abstract: The representer theorem is a cornerstone of kernel methods, which aim to estimate latent functions in reproducing kernel Hilbert spaces (RKHSs) in a nonparametric manner. Its significance lies in converting inherently infinite-dimensional optimization problems into finite-dimensional ones over dual…
-
Gradient-Guided Furthest Point Sampling for Robust Training Set Selection
arXiv:2510.08906v1 Announce Type: new Abstract: Smart training set selection procedures enable the reduction of data needs and improve predictive robustness in machine learning problems relevant to chemistry. We introduce Gradient Guided Furthest Point Sampling (GGFPS), a simple extension of Furthest Point Sampling (FPS) that leverages molecular…
-
Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains
arXiv:2510.08929v1 Announce Type: new Abstract: We study generative modeling on convex domains using flow matching and mirror maps, and identify two fundamental challenges. First, standard log-barrier mirror maps induce heavy-tailed dual distributions, leading to ill-posed dynamics. Second, coupling with Gaussian priors performs poorly…
-
Distributionally robust approximation property of neural networks
arXiv:2510.09177v1 Announce Type: new Abstract: The universal approximation property uniformly with respect to weakly compact families of measures is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby extending classical universal approximation theorems even beyond…
-
A unified Bayesian framework for adversarial robustness
arXiv:2510.09288v1 Announce Type: new Abstract: The vulnerability of machine learning models to adversarial attacks remains a critical security challenge. Traditional defenses, such as adversarial training, typically robustify models by minimizing a worst-case loss. However, these deterministic approaches do not account for uncertainty in the adversary’s attack. While stochastic…
-
Weekly Entering & Transitioning – Thread 13 Oct, 2025 – 20 Oct, 2025
Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…
-
From data scientist to a new role ?
Hi everyone, I’m 25, currently working as a Data Scientist & AI Engineer at a large Space company in Europe, with ~2.5 years of experience. My focus has been on LLM R&D, RAG pipelines, satellite telemetry anomaly detection, surrogate modeling, and some FPGA-compatible ML for onboard systems.…
-
Clustering very different values
I have 200 observations and 3 variables (somewhat correlated). For v1, the median is 300 dollars, but I have a really long tail. When I do the histogram, 100 obs are near 0 and the others form a really long tail, even when I cap outliers. What is the best way to cluster?…
-
What should I ask my potential managers when choosing between two jobs?
I’m deciding between two mid-level data science offers at large tech companies. These are more applied scientist type of roles than analytics. Comp and level are similar, so I’m really trying to figure out which one will set me up for a stronger…
-
Free data set that links company to type of activity?
Best resource to classify, for example: Walmart. Food (top classification), supermarket (sub classification). I work with European companies also. Thanks. Submitted by /u/Due-Duty961.
-
10 Data + AI Observations for Fall 2025
What’s happening, and what’s next, for data and AI at the close of 2025. First appeared on Towards Data Science, by Barr Moses.
-
Dreaming in Blocks — MineWorld, the Minecraft World Model
Explaining “MineWorld: A real-time and open-source interactive world model on Minecraft” in simple terms. First appeared on Towards Data Science, by Youssef Farag.
-
Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death
arXiv:2510.07501v1 Announce Type: new Abstract: Truncation by death, a prevalent challenge in critical care, renders traditional dynamic treatment regime (DTR) evaluation inapplicable due to ill-defined potential outcomes. We introduce a principal stratification-based method, focusing on the always-survivor value function. We derive a semiparametrically efficient,…
-
From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation
arXiv:2510.07624v1 Announce Type: new Abstract: Generative models form the backbone of modern machine learning, underpinning state-of-the-art systems in text, vision, and multimodal applications. While Maximum Likelihood Estimation has traditionally served as the dominant training paradigm, recent work has highlighted its limitations, particularly in…
-
When Robustness Meets Conservativeness: Conformalized Uncertainty Calibration for Balanced Decision Making
arXiv:2510.07750v1 Announce Type: new Abstract: Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient protection or overly conservative and costly solutions. Recent approaches…
-
A Honest Cross-Validation Estimator for Prediction Performance
arXiv:2510.07649v1 Announce Type: new Abstract: Cross-validation is a standard tool for obtaining an honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model performance on the test set, and averages the…
-
Surrogate Graph Partitioning for Spatial Prediction
arXiv:2510.07832v1 Announce Type: new Abstract: Spatial prediction refers to the estimation of unobserved values from spatially distributed observations. Although recent advances have improved the capacity to model diverse observation types, adoption in practice remains limited in industries that demand interpretability. To mitigate this gap, surrogate models that explain black-box…
-
Past is Prologue: How Conversational Analytics Is Changing Data Work
The future of reporting will be about encoding the value proposition of a product into prompt design. First appeared on Towards Data Science, by Whitney Marks.
-
How the Rise of Tabular Foundation Models Is Reshaping Data Science
A turning point for data analysis? First appeared on Towards Data Science, by Pirmin Lemberger.
-
Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy
arXiv:2510.06515v1 Announce Type: new Abstract: Online matching problems arise in many complex systems, from cloud services and online marketplaces to organ exchange networks, where timely, principled decisions are critical for maintaining high system performance. Traditional heuristics in these settings are simple and interpretable but typically…
-
A General Constructive Upper Bound on Shallow Neural Nets Complexity
arXiv:2510.06372v1 Announce Type: new Abstract: We provide an upper bound on the number of neurons required in a shallow neural network to approximate a continuous function on a compact set with a given accuracy. This method, inspired by a specific proof of the Stone-Weierstrass theorem,…
-
Q-Learning with Fine-Grained Gap-Dependent Regret
Q-Learning with Fine-Grained Gap-Dependent Regret arXiv:2510.06647v1 Announce Type: new Abstract: We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing…
-
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix arXiv:2510.06685v1 Announce Type: new Abstract: Self-attention layers have become fundamental building blocks of modern deep neural networks, yet their theoretical understanding remains limited, particularly from the perspective of random matrix theory. In this work, we provide a rigorous analysis of the singular value spectrum of…
-
Bayesian Nonparametric Dynamical Clustering of Time Series
Bayesian Nonparametric Dynamical Clustering of Time Series arXiv:2510.06919v1 Announce Type: new Abstract: We present a method that models the evolution of an unbounded number of time series clusters by switching among an unknown number of regimes with linear dynamics. We develop a Bayesian non-parametric approach using a hierarchical Dirichlet process as a prior on the…
-
Know Your Real Birthday: Astronomical Computation and Geospatial-Temporal Analytics in Python
Know Your Real Birthday: Astronomical Computation and Geospatial-Temporal Analytics in Python A hands-on walkthrough using skyfield, timezonefinder, geopy, and pytz, and further practical applications The post Know Your Real Birthday: Astronomical Computation and Geospatial-Temporal Analytics in Python appeared first on Towards Data Science. Chinmay Kakatkar Go to original source
-
Data Visualization Explained (Part 3): The Role of Color
Data Visualization Explained (Part 3): The Role of Color A simple and powerful guide to using color for more impactful data stories. The post Data Visualization Explained (Part 3): The Role of Color appeared first on Towards Data Science. Murtaza Ali Go to original source
-
Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs
Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs arXiv:2510.05380v1 Announce Type: new Abstract: In graphical models, factor graphs, and more generally energy-based models, the interactions between variables are encoded by a graph, a hypergraph, or, in the most general case, a partially ordered set (poset). Inference…
-
Refereed Learning
Refereed Learning arXiv:2510.05440v1 Announce Type: new Abstract: We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power…
-
Domain-Shift-Aware Conformal Prediction for Large Language Models
Domain-Shift-Aware Conformal Prediction for Large Language Models arXiv:2510.05566v1 Announce Type: new Abstract: Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but standard conformal prediction breaks down under…
-
A Probabilistic Basis for Low-Rank Matrix Learning
A Probabilistic Basis for Low-Rank Matrix Learning arXiv:2510.05447v1 Announce Type: new Abstract: Low rank inference on matrices is widely conducted by optimizing a cost function augmented with a penalty proportional to the nuclear norm $\Vert \cdot \Vert_*$. However, despite the assortment of computational methods for such problems, there is a surprising lack of understanding of…
-
Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes
Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes arXiv:2510.05568v1 Announce Type: new Abstract: Methods for solving scientific computing and inference problems, such as kernel- and neural network-based approaches for partial differential equations (PDEs), inverse problems, and supervised learning tasks, depend crucially on the choice of hyperparameters. Specifically, the…
-
This Puzzle Shows Just How Far LLMs Have Progressed in a Little Over a Year
This Puzzle Shows Just How Far LLMs Have Progressed in a Little Over a Year What took GPT-4o 2 hours to solve, Sonnet 4.5 does in 5 seconds The post This Puzzle Shows Just How Far LLMs Have Progressed in a Little Over a Year appeared first on Towards Data Science. Thomas Reid Go to original source
-
Quantile-Scaled Bayesian Optimization Using Rank-Only Feedback
Quantile-Scaled Bayesian Optimization Using Rank-Only Feedback arXiv:2510.03277v1 Announce Type: new Abstract: Bayesian Optimization (BO) is widely used for optimizing expensive black-box functions, particularly in hyperparameter tuning. However, standard BO assumes access to precise objective values, which may be unavailable, noisy, or unreliable in real-world settings where only relative or rank-based feedback can be obtained. In…
-
Mathematically rigorous proofs for Shapley explanations
Mathematically rigorous proofs for Shapley explanations arXiv:2510.03281v1 Announce Type: new Abstract: Machine Learning is becoming increasingly more important in today’s world. It is therefore very important to provide understanding of the decision-making process of machine-learning models. A popular way to do this is by looking at the Shapley-Values of these models as introduced by Lundberg…
-
Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding
Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding arXiv:2510.03624v1 Announce Type: new Abstract: Robust Principal Component Analysis (RPCA) aims to recover a low-rank structure from noisy, partially observed data that is also corrupted by sparse, potentially large-magnitude outliers. Traditional RPCA models rely on convex relaxations, such as nuclear norm and $\ell_1$…
-
The analogy theorem in Hoare logic
The analogy theorem in Hoare logic arXiv:2510.03685v1 Announce Type: new Abstract: The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification.…
-
Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning
Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning arXiv:2510.03809v1 Announce Type: new Abstract: In high-dimensional learning, models remain stable until they collapse abruptly once the sample size falls below a critical level. This instability is not algorithm-specific but a geometric mechanism: when the weakest Fisher eigendirection falls beneath sample-level fluctuations, identifiability fails.…
-
How to Perform Effective Agentic Context Engineering
How to Perform Effective Agentic Context Engineering Learn how to optimize the context of your agents, for powerful agentic performance The post How to Perform Effective Agentic Context Engineering appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How I Used ChatGPT to Land My Next Data Science Role
How I Used ChatGPT to Land My Next Data Science Role Practical AI hacks for every stage of the job search — with real prompts and examples The post How I Used ChatGPT to Land My Next Data Science Role appeared first on Towards Data Science. Yu Dong Go to original source
-
How To Build Effective Technical Guardrails for AI Applications
How To Build Effective Technical Guardrails for AI Applications Exploring the most practical guardrails to implement at ground level The post How To Build Effective Technical Guardrails for AI Applications appeared first on Towards Data Science. Nidhin Karunakaran Ponon Go to original source
-
Plotly Dash — A Structured Framework for a Multi-Page Dashboard
Plotly Dash — A Structured Framework for a Multi-Page Dashboard An easy starting point for larger and more complicated Dash dashboards The post Plotly Dash — A Structured Framework for a Multi-Page Dashboard appeared first on Towards Data Science. Michael Clayton Go to original source
-
Higher-arity PAC learning, VC dimension and packing lemma
Higher-arity PAC learning, VC dimension and packing lemma arXiv:2510.02420v1 Announce Type: new Abstract: The aim of this note is to overview some of our work in Chernikov, Towsner’20 (arXiv:2010.00726) developing higher arity VC theory (VC$_n$ dimension), including a generalization of Haussler packing lemma, and an associated tame (slice-wise) hypergraph regularity lemma; and to demonstrate that…
-
Predictive inference for time series: why is split conformal effective despite temporal dependence?
Predictive inference for time series: why is split conformal effective despite temporal dependence? arXiv:2510.02471v1 Announce Type: new Abstract: We consider the problem of uncertainty quantification for prediction in a time series: if we use past data to forecast the next time point, can we provide valid prediction intervals around our forecasts? To avoid placing distributional…
-
Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling
Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling arXiv:2510.02499v1 Announce Type: new Abstract: Diffusion models have emerged as powerful generative frameworks with widespread applications across machine learning and artificial intelligence systems. While current research has predominantly focused on linear diffusions, these approaches can face significant challenges when modeling a conditional distribution, $P(Y|X=x)$, when…
-
Adaptive randomized pivoting and volume sampling
Adaptive randomized pivoting and volume sampling arXiv:2510.02513v1 Announce Type: new Abstract: Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning algorithms for linear regression. As consequences, this paper presents new analysis…