Tag: our
-
How We Are Testing Our Agents in Dev
How We Are Testing Our Agents in Dev Testing that your AI agent is performing as expected is not easy. Here are a few strategies we learned the hard way. The post How We Are Testing Our Agents in Dev appeared first on Towards Data Science. By Michael Segner.
-
Distributionally Robust Online Markov Game with Linear Function Approximation
Distributionally Robust Online Markov Game with Linear Function Approximation arXiv:2511.07831v1 Announce Type: new Abstract: The sim-to-real gap, where agents trained in a simulator face significant performance degradation during testing, is a fundamental challenge in reinforcement learning. Extensive works adopt the framework of distributionally robust RL, to learn a policy that acts robustly under worst case…
-
Private Learning of Littlestone Classes, Revisited
Private Learning of Littlestone Classes, Revisited arXiv:2510.00076v1 Announce Type: new Abstract: We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot \log(T))$ in the realizable case, where $d$ denotes the…
-
Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals
Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals arXiv:2509.15989v1 Announce Type: new Abstract: We propose a novel family of model-free algorithms for node clustering and parameter inference in graphs generated from the Stochastic Block Model (SBM), a fundamental framework in community detection. Drawing inspiration from…
-
If we use AI to do our work – what is our job, then?
If we use AI to do our work – what is our job, then? Images. Text. Audio. There’s no modality that is not handled by AI. And AI systems reach even further, planning advertisement and marketing campaigns, automating social media postings, … Most of this was unthinkable a mere ten years ago. But then, the…
-
Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation
Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation arXiv:2508.20942v1 Announce Type: new Abstract: In this paper, we extend the transfer learning classification framework from regression function-based methods to decision rules. We propose a novel methodology for modeling posterior drift through Bayes decision rules. By exploiting the geometric…
-
Underdamped Langevin MCMC with third order convergence
Underdamped Langevin MCMC with third order convergence arXiv:2508.16485v1 Announce Type: new Abstract: In this paper, we propose a new numerical method for the underdamped Langevin diffusion (ULD) and present a non-asymptotic analysis of its sampling error in the 2-Wasserstein distance when the $d$-dimensional target distribution $p(x)\propto e^{-f(x)}$ is strongly log-concave and has varying degrees of…
-
Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference
Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference arXiv:2508.09347v1 Announce Type: new Abstract: We present two analytical formulae for estimating the sensitivity — namely, the gradient or Jacobian — at given realizations of an arbitrary-dimensional random vector with respect to its distributional parameters. The first formula interprets this sensitivity as partial derivatives of the inverse…
-
Differentially Private Model-X Knockoffs via Johnson-Lindenstrauss Transform
Differentially Private Model-X Knockoffs via Johnson-Lindenstrauss Transform arXiv:2508.04800v1 Announce Type: new Abstract: We introduce a novel privatization framework for high-dimensional controlled variable selection. Our framework enables rigorous False Discovery Rate (FDR) control under differential privacy constraints. While the Model-X knockoff procedure provides FDR guarantees by constructing provably exchangeable “negative control” features, existing privacy mechanisms like…
-
Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives
Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives arXiv:2506.20114v1 Announce Type: new Abstract: Tree ensembles are non-parametric methods widely recognized for their accuracy and ability to capture complex interactions. While these models excel at prediction, they are difficult to interpret and may fail to uncover useful relationships in the data. We propose an…
-
Oh SnapMMD! Forecasting Stochastic Dynamics Beyond the Schrödinger Bridge’s End
Oh SnapMMD! Forecasting Stochastic Dynamics Beyond the Schrödinger Bridge’s End arXiv:2505.16082v1 Announce Type: new Abstract: Scientists often want to make predictions beyond the observed time horizon of “snapshot” data following latent stochastic dynamics. For example, in time course single-cell mRNA profiling, scientists have access to cellular transcriptional state measurements (snapshots) from different biological replicates at…
-
Infinite hierarchical contrastive clustering for personal digital envirotyping
Infinite hierarchical contrastive clustering for personal digital envirotyping arXiv:2505.15022v1 Announce Type: new Abstract: Daily environments have profound influence on our health and behavior. Recent work has shown that digital envirotyping, where computer vision is applied to images of daily environments taken during ecological momentary assessment (EMA), can be used to identify meaningful relationships between environmental…
-
We Need a Fourth Law of Robotics in the Age of AI
We Need a Fourth Law of Robotics in the Age of AI Artificial Intelligence has become a mainstay of our daily lives, revolutionizing industries, accelerating scientific discoveries, and reshaping how we communicate. Yet, alongside its undeniable benefits, AI has also ignited a range of ethical and social dilemmas that our existing regulatory frameworks have struggled…
-
On Model Protection in Federated Learning against Eavesdropping Attacks
On Model Protection in Federated Learning against Eavesdropping Attacks arXiv:2504.02114v1 Announce Type: cross Abstract: In this study, we investigate the protection offered by federated learning algorithms against eavesdropping adversaries. In our model, the adversary is capable of intercepting model updates transmitted from clients to the server, enabling it to create its own estimate of the…
-
Backdoor Detection through Replicated Execution of Outsourced Training
Backdoor Detection through Replicated Execution of Outsourced Training arXiv:2504.00170v1 Announce Type: cross Abstract: It is common practice to outsource the training of machine learning models to cloud providers. Clients who do so gain from the cloud’s economies of scale, but implicitly assume trust: the server should not deviate from the client’s training procedure. A malicious…
-
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement arXiv:2503.12760v1 Announce Type: new Abstract: To design effective digital interventions, experimenters face the challenge of learning decision policies that balance multiple objectives using offline data. Often, they aim to develop policies that maximize goal outcomes, while ensuring there are no undesirable changes in guardrail…
-
Experiments Illustrated: Can $1 Change Behavior More Than $100?
Experiments Illustrated: Can $1 Change Behavior More Than $100? I currently lead a small data team at a small tech company. With everything small, we have a lot of autonomy over what, when, and how we run experiments. In this series, I’m opening the vault from our years of experimenting, each story highlighting a key…
-
Write for Towards Data Science
Write for Towards Data Science Quick Links: Submission Guidelines; How To Submit Your Work; How to get your article ready for publication!; Adding and using images; Longform posts, columns, and online books; FAQ. Why become a contributor? We are looking for writers to propose up-to-date content focused on data science, machine learning, artificial intelligence and…
-
Fréchet Cumulative Covariance Net for Deep Nonlinear Sufficient Dimension Reduction with Random Objects
Fréchet Cumulative Covariance Net for Deep Nonlinear Sufficient Dimension Reduction with Random Objects arXiv:2502.15374v1 Announce Type: new Abstract: Nonlinear sufficient dimension reduction \citep{libing_generalSDR}, which constructs nonlinear low-dimensional representations to summarize essential features of high-dimensional data, is an important branch of representation learning. However, most existing methods are not applicable when the response variables are complex non-Euclidean…
-
Building a Data Engineering Center of Excellence
Building a Data Engineering Center of Excellence As data continues to grow in importance and become more complex, the need for skilled data engineers has never been greater. But what is data engineering, and why is it so important? In this blog post, we will discuss the essential components of a functioning data engineering practice…
-
Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models
Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models arXiv:2502.01919v1 Announce Type: new Abstract: In this work, we present a comprehensive Bayesian posterior analysis of what we term Poisson Hierarchical Indian Buffet Processes, designed for complex random sparse count species sampling models that allow…
-
Towards Data Science is Launching as an Independent Publication
Towards Data Science is Launching as an Independent Publication Since founding Towards Data Science in 2016, we’ve built the largest publication on Medium with a dedicated community of readers and contributors focused on data science, machine learning, and AI. Medium built a fantastic platform, and we wouldn’t have been able to reach our audience without…
-
Trustworthy Evaluation of Generative AI Models
Trustworthy Evaluation of Generative AI Models arXiv:2501.18897v1 Announce Type: new Abstract: Generative AI (GenAI) models have recently achieved remarkable empirical performance in various applications; however, their evaluations still lack uncertainty quantification. In this paper, we propose a method to compare two generative models based on an unbiased estimator of their relative performance gap. Statistically, our…
-
Semantically Compress Text to Save On LLM Costs
Semantically Compress Text to Save On LLM Costs LLMs are great… if they can fit all of your data. Originally published at https://blog.developer.bazaarvoice.com on October 28, 2024. Large language models are fantastic tools for unstructured text, but what if your text doesn’t fit in the context window? Bazaarvoice faced exactly this…