Category: aimlds
-
Distribution free uncertainty quantification in neuroscience-inspired deep operators
arXiv:2412.09369v1 Announce Type: new Abstract: Energy-efficient deep learning algorithms are essential for a sustainable future and feasible edge computing setups. Spiking neural networks (SNNs), inspired from neuroscience, are a positive step in the direction of achieving the required energy efficiency. However, in a bid to lower the…
-
Why Retrieval-Augmented Generation Is Still Relevant in the Era of Long-Context Language Models
In this article we will explore why models with context windows of 128K tokens and more can’t fully replace RAG. By Jérôme DIAZ, Towards Data Science.
-
Transformers Key-Value (KV) Caching Explained
Speed up your LLM inference. By Michał Oleszak, Towards Data Science.
-
CV VideoPlayer — Once and For All
A Python video player package made for computer vision research. When developing computer vision algorithms, the journey from concept to working implementation often involves countless iterations of watching, analyzing, and debugging video frames. As I dove deeper into computer vision projects, I found myself repeatedly…
-
Sentiment analysis template: A complete data science project
10 essential steps, from data exploration to model deployment. By Leo Anello, Towards Data Science.
-
Why “AI Can’t Reason” Is a Bias
We humans are proud creatures. By Rafe Brena, Ph.D., Towards Data Science.
-
Score-Optimal Diffusion Schedules
arXiv:2412.07877v1 Announce Type: new Abstract: Denoising diffusion models (DDMs) offer a flexible framework for sampling from high dimensional data distributions. DDMs generate a path of probability distributions interpolating between a reference Gaussian distribution and a data distribution by incrementally injecting noise into the data. To numerically simulate the sampling process, a discretisation…
-
Low-Rank Correction for Quantized LLMs
arXiv:2412.07902v1 Announce Type: new Abstract: We consider the problem of model compression for Large Language Models (LLMs) at post-training time, where the task is to compress a well-trained model using only a small set of calibration input data. In this work, we introduce a new low-rank approach to correct for…
-
An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints
arXiv:2412.08060v1 Announce Type: new Abstract: We study Online Convex Optimization (OCO) with adversarial constraints, where an online algorithm must make repeated decisions to minimize both convex loss functions and cumulative constraint violations. We focus on a setting where the algorithm has access to predictions of…
-
Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models
arXiv:2412.07972v1 Announce Type: cross Abstract: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Previous work shows that the phase where the relative probability between the modes is learned disappears as the dimension…
-
Spectral Differential Network Analysis for High-Dimensional Time Series
arXiv:2412.07905v1 Announce Type: cross Abstract: Spectral networks derived from multivariate time series data arise in many domains, from brain science to Earth science. Often, it is of interest to study how these networks change under different conditions. For instance, to better understand epilepsy, it would be interesting…
-
3 Business Skills You Need to Progress Your Data Science Career in 2025
Including resources for how to build those skills. If you have been a data scientist for a while, sooner or later you’ll notice that your day-to-day has shifted from a VSCode-loving, research paper-reading, git-version-committing data…
-
Translating a Memoir: A Technical Journey
Leveraging GPT-3.5 and Unstructured APIs for translations. This blog post details how I utilised GPT to translate the personal memoir of a family friend, making it accessible to a broader audience. Specifically, I employed GPT-3.5 for translation and Unstructured’s APIs for efficient content extraction and formatting. The memoir, a…
-
5 Essential Tips to Build Business Dashboards Stakeholders Love
A practical guide to designing clear, effective, and actionable dashboards for decision-making. By Yu Dong, Towards Data Science.
-
I’m Doing the Advent of Code 2024 in Python — Day 2
Let’s see how many stars we’ll collect. By Soner Yıldırım, Towards Data Science.
-
Measuring the Cost of Production Issues on Development Teams
Deprioritizing quality sacrifices both software stability and velocity, leading to costly issues. Investing in quality boosts speed and outcomes. Investing in software quality is often easier said than done. Although many engineering managers express a commitment to high-quality software,…
-
Generalized Least Squares Kernelized Tensor Factorization
arXiv:2412.07041v1 Announce Type: new Abstract: Real-world datasets often contain missing or corrupted values. Completing multidimensional tensor-structured data with missing entries is essential for numerous applications. Smoothness-constrained low-rank factorization models have shown superior performance with reduced computational costs. While effective at capturing global and long-range correlations, these models struggle to…
-
Sequential Controlled Langevin Diffusions
arXiv:2412.07081v1 Announce Type: new Abstract: An effective approach for sampling from unnormalized densities is based on the idea of gradually transporting samples from an easy prior to the complicated target distribution. Two popular methods are (1) Sequential Monte Carlo (SMC), where the transport is performed through successive annealed densities via prescribed…
-
A Note on Sample Complexity of Interactive Imitation Learning with Log Loss
arXiv:2412.07057v1 Announce Type: new Abstract: Imitation learning (IL) is a general paradigm for learning from experts in sequential decision-making problems. Recent advancements in IL have shown that offline imitation learning, specifically Behavior Cloning (BC) with log loss, is minimax optimal. Meanwhile, its interactive…
-
Optimization Can Learn Johnson Lindenstrauss Embeddings
arXiv:2412.07242v1 Announce Type: new Abstract: Embeddings play a pivotal role across various disciplines, offering compact representations of complex data structures. Randomized methods like Johnson-Lindenstrauss (JL) provide state-of-the-art and essentially unimprovable theoretical guarantees for achieving such representations. These guarantees are worst-case and in particular, neither the analysis, nor the algorithm,…
-
Modeling High-Resolution Spatio-Temporal Wind with Deep Echo State Networks and Stochastic Partial Differential Equations
arXiv:2412.07265v1 Announce Type: new Abstract: In the past decades, clean and renewable energy has gained increasing attention due to a global effort on carbon footprint reduction. In particular, Saudi Arabia is gradually shifting its energy portfolio from an exclusive use of…
-
Missing Data in Time-Series: Machine Learning Techniques
Part 1: Leverage linear regression and decision trees to impute time-series gaps. By Sara Nóbrega, Towards Data Science.
-
Awesome Plotly with Code Series (Part 5): The Order in Bar Charts Matters
And it is not always simply ordering by highest to lowest. By Jose Parreño, Towards Data Science.
-
How to Apply the Central Limit Theorem to Constrained Data
What can we say about the mean of data distributed in an interval [a, b]? By Ryan Burn, Towards Data Science.
-
How to Use Structured Generation for LLM-as-a-Judge Evaluations
Structured generation is fundamental to building complex, multi-step reasoning agents in LLM evaluations, especially for open source models. Disclosure: I am a maintainer of Opik, one of the open source projects used later in this article. For the past few months, I’ve been working on LLM-based…
-
Nobel Prizes 2024: AI Breakthroughs Win Big
Lessons Learned After the AI Nobel Debate. By Andrea Valenzuela, Towards Data Science.
-
Ranking of Large Language Model with Nonparametric Prompts
arXiv:2412.05506v1 Announce Type: new Abstract: We consider the inference for the ranking of large language models (LLMs). Alignment arises as a big challenge to mitigate hallucinations in the use of LLMs. Ranking LLMs has been shown as a well-performing tool to improve alignment based on the best-of-$N$…
-
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
arXiv:2412.05723v1 Announce Type: new Abstract: Estimating the uncertainty of responses of Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free…
-
Proximal Iteration for Nonlinear Adaptive Lasso
arXiv:2412.05726v1 Announce Type: new Abstract: Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to efficiently conduct estimation and variable selection simultaneously in sophisticated models and can be efficiently implemented using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated…
-
Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution
arXiv:2412.05759v1 Announce Type: new Abstract: Understanding how changes in explanatory features affect the unconditional distribution of the outcome is important in many applications. However, existing black-box predictive models are not readily suited for analyzing such questions. In this work, we develop an approximation method to…
-
Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application
arXiv:2412.05906v1 Announce Type: new Abstract: We study the discrete-time linear-quadratic (LQ) control model using reinforcement learning (RL). Using entropy to measure the cost of exploration, we prove that the optimal feedback policy for the problem must be Gaussian type. Then, we apply the results…
-
The Case Against Centralized Medallion Architecture
Why tailored, decentralized data quality trumps the medallion architecture. By Bernd Wessely, Towards Data Science.
-
Uncertainty Quantification in Time Series Forecasting
A deep dive into EnbPI, a Conformal Prediction approach for time series forecasting. By Jonte Dancker, Towards Data Science.
-
How to Evaluate Multilingual LLMs With Global-MMLU
Evaluation of language-specific LLM accuracy on the global Massive Multitask Language Understanding benchmark in Python. By Dr. Leon Eversberg, Towards Data Science.
-
Here’s What I Learned About Information Theory Through Wordle
The Science Behind Better Guesses. By Saankhya Mondal, Towards Data Science.
-
Why Data Scientists Need These Software Engineering Skills
Learn these things to become a more well-rounded data scientist. By Egor Howell, Towards Data Science.
-
The Polynomial Stein Discrepancy for Assessing Moment Convergence
arXiv:2412.05135v1 Announce Type: new Abstract: We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality like the effective sample size are not appropriate for scalable Bayesian sampling algorithms, such…
-
Semiparametric Bayesian Difference-in-Differences
arXiv:2412.04605v1 Announce Type: cross Abstract: This paper studies semiparametric Bayesian inference for the average treatment effect on the treated (ATT) within the difference-in-differences research design. We propose two new Bayesian methods with frequentist validity. The first one places a standard Gaussian process prior on the conditional mean function of the control group.…
-
Disentangled Representation Learning for Causal Inference with Instruments
arXiv:2412.04641v1 Announce Type: cross Abstract: Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV based estimators need a known IV or other strong assumptions, such as the existence…
-
Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise
arXiv:2412.04648v1 Announce Type: cross Abstract: Recorrupted-to-Recorrupted (R2R) has emerged as a methodology for training deep networks for image restoration in a self-supervised manner from noisy measurement data alone, demonstrating equivalence in expectation to the supervised squared loss in the case of Gaussian noise. However, its effectiveness with non-Gaussian…
-
Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals
arXiv:2412.04736v1 Announce Type: cross Abstract: This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low. We postulate that a $p$-dimensional response series…
-
Weekly Entering & Transitioning – Thread 09 Dec, 2024 – 16 Dec, 2024
Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…
-
Is your org treating the rollout of LLMs as an IT or data science problem?
Our org has given all resources for LLMs (and limited all API access) to a dedicated team in the IT department, which has no prior data experience. So far no data scientist has been engaged for feedback on design or…
-
Are certifications even worth it these days?
So, I’m a CS major, stats minor undergrad, and I’ve done a couple of certifications: AWS Cloud Practitioner and IBM Data Science. Honestly, I’m not sure if they added much value. In one interview, I mentioned my certifications right at the end, and they didn’t even seem to notice.…
-
How to find freelance opportunities – what is the most typical type of project you do as a freelancer
Hi all, I have 5+ years of experience. I’m based in Europe. Lately I’m thinking of switching from full-time employee to contractor, doing freelancing and working for different companies at the same time. I think that freelancing…
-
Timeseries pattern detection problem
I’ve never dealt with any time series data. Please help me understand if I’m reinventing the wheel or on the right track. I’m building a little hobby app, which is a habit tracker of sorts. The idea is that it lets the user record things they’ve done, on a daily…
-
Streamline Your Workflow when Starting a New Research Paper
Python code to create folders and Word documents for research papers in biomedical sciences, all in one go with only two inputs. By Rodrigo M Carrillo Larco, MD, PhD, Towards Data Science.
-
AI, My Holiday Elf: Building a Gift Recommender for the Perfect Christmas
How I used AI and Streamlit to create a festive and fun gift recommendation app. By Shuqing Ke, Towards Data Science.
-
Scientists Go Serious About Large Language Models Mirroring Human Thinking
A discussion of the latest research suggesting that LLMs do work like the human brain, with some substantial differences. By LucianoSphere (Luciano Abriata, PhD), Towards Data Science.
-
My #30DayMapChallenge 2024
30 Days, 30 Maps: My November Adventure in Digital Cartography. By Glenn Kong, Towards Data Science.
-
How to Prepare for Your Data Science Behavioural Interview
My top tips to smash your next data science behavioural interview. By Egor Howell, Towards Data Science.
-
I’m Doing the Advent of Code 2024 in Python — Day 1
Let’s see how many stars we’ll collect. By Soner Yıldırım, Towards Data Science.
-
Modeling DAU with Markov Chain
How to predict DAU using Duolingo’s growth model and control the prediction. Doubtlessly, DAU, WAU, and MAU (daily, weekly, and monthly active users) are critical business metrics. An article “How Duolingo reignited user growth” by Jorge Mazal, former CPO of Duolingo, is #1 in the Growth section of Lenny’s Newsletter…
-
Combining Large and Small LLMs to Boost Inference Time and Quality
Implementing Speculative and Contrastive Decoding. Large Language models are comprised of billions of parameters (weights). For each word it generates, the model has to perform computationally expensive calculations across all of these parameters. Large Language models accept a sentence, or sequence of tokens, and…
-
How to Integrate AI and Data Science into Your Business Strategy
Insider consulting guide to conducting a successful 2-day executive workshop. “Our industry does not respect tradition — it only respects innovation.” (Satya Nadella, CEO Microsoft, letter to employees in 2014) While not all industries are as competitive and cutthroat as the…
-
Reinforcement Learning: Self-Driving Cars to Self-Driving Labs
Understanding AI applications in bio for machine learning engineers. Anyone who has tried teaching a dog new tricks knows the basics of reinforcement learning. We can modify the dog’s behavior by repeatedly offering rewards for obedience and punishments for misbehavior. In reinforcement learning…
-
Asymptotics of Linear Regression with Linearly Dependent Data
arXiv:2412.03702v1 Announce Type: new Abstract: In this paper we study the asymptotics of linear regression in settings where the covariates exhibit a linear dependency structure, departing from the standard assumption of independence. We model the covariates using stochastic processes with spatio-temporal covariance and analyze the performance of…
-
Community Detection with Heterogeneous Block Covariance Model
arXiv:2412.03780v1 Announce Type: new Abstract: Community detection is the task of clustering objects based on their pairwise relationships. Most of the model-based community detection methods, such as the stochastic block model and its variants, are designed for networks with binary (yes/no) edges. In many practical scenarios, edges often…
-
Learning Networks from Wide-Sense Stationary Stochastic Processes
arXiv:2412.03768v1 Announce Type: new Abstract: Complex networked systems driven by latent inputs are common in fields like neuroscience, finance, and engineering. A key inference problem here is to learn edge connectivity from node outputs (potentials). We focus on systems governed by steady-state linear conservation laws: $X_t = L^{\ast} Y_t$,…
-
How well behaved is finite dimensional Diffusion Maps?
arXiv:2412.03992v1 Announce Type: new Abstract: Under a set of assumptions on a family of submanifolds $\subset \mathbb{R}^D$, we derive a series of geometric properties that remain valid after finite-dimensional and almost isometric Diffusion Maps (DM), including almost uniform density, finite polynomial approximation and local reach. Leveraging…
-
Pathwise optimization for bridge-type estimators and its applications
arXiv:2412.04047v1 Announce Type: new Abstract: Sparse parametric models are of great interest in statistical learning and are often analyzed by means of regularized estimators. Pathwise methods allow to efficiently compute the full solution path for penalized estimators, for any possible value of the penalization parameter $\lambda$. In…
-
Bridging the Data Literacy Gap
The Advent, Evolution, and Current State of “Data Translators”. With data being constantly glorified as the most valuable asset organizations can own, leaders and decision-makers are always looking for effective ways to put their data insights to use. Every time customers interact with digital products, millions of data points…
-
Multimodal RAG: Process Any File Type with AI
A beginner-friendly guide with example (Python) code. This is the third article in a larger series on multimodal AI. In the previous posts, we discussed multimodal LLMs and embedding models, respectively. In this article, we will combine these ideas to enable the development of multimodal RAG systems. I’ll…
-
Chat with Your Images using Multimodal LLMs
Chat with Your Images Using Llama 3.2-Vision Multimodal LLMs. Learn how to build Llama 3.2-Vision locally in a chat-like mode, and explore its multimodal skills on a Colab notebook. The integration of vision capabilities with Large Language Models (LLMs) is revolutionizing…
-
Who Does What in Data? A Practical Introduction to the Role of a Data Engineer & Data Scientist
What does a data engineer do differently to a data scientist? By Sarah Lea, Towards Data Science.
-
GPS Interpolation Using Maps and Kinematics
How do you apply dead reckoning to your geospatial dataset? The picture above illustrates the GPS interpolation process. The red dots represent the known and repeated GPS locations, with more than one location per dot, while the blue dots represent the inferred locations of the repeated points along the…
-
Universal Rates of Empirical Risk Minimization
arXiv:2412.02810v1 Announce Type: new Abstract: The well-known empirical risk minimization (ERM) principle is the basis of many widely used machine learning algorithms, and plays an essential role in the classical PAC theory. A common description of a learning algorithm’s performance is its so-called “learning curve”, that is, the decay…
-
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits
arXiv:2412.02861v1 Announce Type: new Abstract: We study the performance of the Thompson Sampling algorithm for logistic bandit problems, where the agent receives binary rewards with probabilities determined by a logistic function $\exp(\beta \langle a, \theta \rangle)/(1+\exp(\beta \langle a, \theta \rangle))$. We focus on the setting where…
-
Preference-based Pure Exploration
arXiv:2412.02988v1 Announce Type: new Abstract: We study the preference-based pure exploration problem for bandits with vector-valued rewards. The rewards are ordered using a (given) preference cone $\mathcal{C}$ and our goal is to identify the set of Pareto optimal arms. First, to quantify the impact of preferences, we derive a novel lower…
-
Generalized Diffusion Model with Adjusted Offset Noise
arXiv:2412.03134v1 Announce Type: new Abstract: Diffusion models have become fundamental tools for modeling data distributions in machine learning and have applications in image generation, drug discovery, and audio synthesis. Despite their success, these models face challenges when generating data with extreme brightness values, as evidenced by limitations in…
-
Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs
arXiv:2412.03271v1 Announce Type: new Abstract: Neural Jump ODEs model the conditional expectation between observations by neural ODEs and jump at arrival of new observations. They have demonstrated effectiveness for fully data-driven online forecasting in settings with irregular and partial observations, operating under weak regularity assumptions. This…
-
How to Build a General-Purpose LLM Agent
A Step-by-Step Guide. Why build a general-purpose agent? Because it’s an excellent tool to prototype your use cases and lays the groundwork for designing your own custom agentic architecture. Before we dive in, let’s quickly introduce LLM agents. Feel free…
-
How to Interpret Matrix Expressions — Transformations
Matrix algebra for a data scientist. This article begins a series for anyone who finds matrix algebra overwhelming. My goal is to turn what you’re afraid of into what you’re fascinated by. You’ll find it especially helpful if you want to understand machine learning concepts…
-
What Teaching AI Taught me About Data Skills & People
Three key lessons from my journey as a corporate AI educator. As an AI Educator, my job was to equip corporate teams with the data & AI skills they needed to thrive. But looking back, I realized that I learned far more from…
-
Introducing Univariate Exemplar Recommenders: how to profile Customer Behavior in a single vector
Surveying and improving the current methodologies for customer profiling. To understand this article, knowledge of embeddings, clustering, and recommendation systems is required. The implementation of this algorithm has been released on GitHub and is fully open-source. I am open to…
-
Step-by-Step Guide for Building Bump Charts in Plotly
Learn how to create custom bump charts in Python using Plotly for data visualization. By Amanda Iglesias Moreno, Towards Data Science.
-
MEP-Net: Generating Solutions to Scientific Problems with Limited Knowledge by Maximum Entropy Principle
arXiv:2412.02090v1 Announce Type: new Abstract: Maximum entropy principle (MEP) offers an effective and unbiased approach to inferring unknown probability distributions when faced with incomplete information, while neural networks provide the flexibility to learn complex distributions from data. This paper proposes a novel…
-
Selective Reviews of Bandit Problems in AI via a Statistical View
arXiv:2412.02251v1 Announce Type: new Abstract: Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes stochastic multi-armed bandit (MAB) and continuum-armed bandit (SCAB) problems, which model sequential decision-making…
-
The Broader Landscape of Robustness in Algorithmic Statistics
arXiv:2412.02670v1 Announce Type: new Abstract: The last decade has seen a number of advances in computationally efficient algorithms for statistical methods subject to robustness constraints. An estimator may be robust in a number of different ways: to contamination of the dataset, to heavy-tailed data, or in the…
-
Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering
Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering arXiv:2412.02292v1 Announce Type: new Abstract: Recently, deep matrix factorization has been established as a powerful model for unsupervised tasks, achieving promising results, especially for multi-view clustering. However, existing methods often lack effective feature selection mechanisms and rely on empirical hyperparameter selection. To address these issues, we…
-
Composition of Experts: A Modular Compound AI System Leveraging Large Language Models
Composition of Experts: A Modular Compound AI System Leveraging Large Language Models arXiv:2412.01868v1 Announce Type: cross Abstract: Large Language Models (LLMs) have achieved remarkable advancements, but their monolithic nature presents challenges in terms of scalability, cost, and customization. This paper introduces the Composition of Experts (CoE), a modular compound AI system leveraging multiple expert LLMs.…
-
Query Optimization for Mere Humans in PostgreSQL
Query Optimization for Mere Humans in PostgreSQL PostgreSQL: Query Optimization for Mere Humans Understanding a PostgreSQL execution plan with practical examples Today, users have high expectations for the programs they use. Users expect programs to have amazing features, to be fast, and to consume a reasonable amount of resources. As developers,…
-
Becoming a Data Scientist: What I Would Do If I Had to Start Over
Becoming a Data Scientist: What I Would Do If I Had to Start Over Breaking into data science: The Good, the Bad, and the Python Bugs Martin Luther King Jr. is famous for his speech, “I Have a Dream.” He delivered it at the Lincoln Memorial in Washington, D.C., on August…
-
Bird’s-Eye View of Linear Algebra: Left, Right Inverse => Injective, Surjective Maps
Bird’s-Eye View of Linear Algebra: Left, Right Inverse => Injective, Surjective Maps If matrix multiplication isn’t commutative, then why don’t we have left and right inverses? Continue reading on Towards Data Science » Rohit Pandey Go to original source
-
The Name That Broke ChatGPT: Who is David Mayer?
The Name That Broke ChatGPT: Who is David Mayer? AI, privacy, human bias, prompting, the future of content, and how to hack a chatbot Continue reading on Towards Data Science » Cassie Kozyrkov Go to original source
-
The Cultural Impact of AI Generated Content: Part 1
The Cultural Impact of AI Generated Content: Part 1 What happens when AI generated media becomes ubiquitous in our lives? How does this relate to what we’ve experienced before, and how does it change us? This is the first part of a two part series I’m writing analyzing how people and…
-
Nonlinearity and Uncertainty Informed Moment-Matching Gaussian Mixture Splitting
Nonlinearity and Uncertainty Informed Moment-Matching Gaussian Mixture Splitting arXiv:2412.00343v1 Announce Type: new Abstract: Many problems in navigation and tracking require increasingly accurate characterizations of the evolution of uncertainty in nonlinear systems. Nonlinear uncertainty propagation approaches based on Gaussian mixture density approximations offer distinct advantages over sampling based methods in their computational cost and continuous representation.…
-
Optimal Particle-based Approximation of Discrete Distributions (OPAD)
Optimal Particle-based Approximation of Discrete Distributions (OPAD) arXiv:2412.00545v1 Announce Type: new Abstract: Particle-based methods include a variety of techniques, such as Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC), for approximating a probabilistic target distribution with a set of weighted particles. In this paper, we prove that for any set of particles, there…
-
Explicit and data-Efficient Encoding via Gradient Flow
Explicit and data-Efficient Encoding via Gradient Flow arXiv:2412.00864v1 Announce Type: new Abstract: The autoencoder model typically uses an encoder to map data to a lower dimensional latent space and a decoder to reconstruct it. However, relying on an encoder for inversion can lead to suboptimal representations, particularly limiting in physical sciences where precision is key.…
-
A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net
A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net arXiv:2412.01010v1 Announce Type: new Abstract: The Transfer Elastic Net is an estimation method for linear regression models that combines $\ell_1$ and $\ell_2$ norm penalties to facilitate knowledge transfer. In this study, we derive a non-asymptotic $\ell_2$ norm estimation error bound for the…
-
Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces
Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces arXiv:2412.01019v1 Announce Type: new Abstract: Energy-based models (EBMs) offer a flexible framework for probabilistic modelling across various data domains. However, training EBMs on data in discrete or mixed state spaces poses significant challenges due to the lack of robust and fast sampling…
-
Google Gemini Is Entering the Advent of Code Challenge
Google Gemini Is Entering the Advent of Code Challenge An open-source project to explore the capabilities and limitations of LLMs on coding challenges What is this about? If 2024 taught us anything in the realm of Generative AI, then it is that coding is one of the most promising…
-
RAG: Hybrid Search Based on Two Indexes
RAG: Hybrid Search Based on Two Indexes The proposition I will be talking about in this article is something I already have implemented and I am currently testing in a personal… Continue reading on Towards Data Science » Jérôme DIAZ Go to original source
-
Context-Aided Forecasting: Enhancing Forecasting with Textual Data
Context-Aided Forecasting: Enhancing Forecasting with Textual Data A promising alternative approach to improve forecasting Continue reading on Towards Data Science » Nikos Kafritsas Go to original source
-
3D Clustering with Graph Theory: The Complete Guide
3D Clustering with Graph Theory: The Complete Guide Python Tutorial for Euclidean Clustering of 3D Point Clouds with Graph Theory. Fundamental concepts and sequential workflow for… Continue reading on Towards Data Science » Florent Poux, Ph.D. Go to original source
-
Machine Learning Experiments Done Right
Machine Learning Experiments Done Right A detailed guideline for designing machine learning experiments that produce reliable, reproducible results. Machine learning (ML) practitioners run experiments to compare the effectiveness of methods for both specific applications and for general types of problems. The validity of experimental results hinges on how practitioners design,…
-
The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?
The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History? arXiv:2411.18656v1 Announce Type: new Abstract: In today’s world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking,…
-
Graph Max Shift: A Hill-Climbing Method for Graph Clustering
Graph Max Shift: A Hill-Climbing Method for Graph Clustering arXiv:2411.18794v1 Announce Type: new Abstract: We present a method for graph clustering that is analogous with gradient ascent methods previously proposed for clustering points in space. We show that, when applied to a random geometric graph with data iid from some density with Morse regularity, the…
-
Intrinsic Wrapped Gaussian Process Regression Modeling for Manifold-valued Response Variable
Intrinsic Wrapped Gaussian Process Regression Modeling for Manifold-valued Response Variable arXiv:2411.18989v1 Announce Type: new Abstract: In this paper, we propose a novel intrinsic wrapped Gaussian process regression model for response variable measured on Riemannian manifold. We apply the parallel transport operator to define an intrinsic covariance structure addressing a critical aspect of constructing a well…
-
ABROCA Distributions For Algorithmic Bias Assessment: Considerations Around Interpretation
ABROCA Distributions For Algorithmic Bias Assessment: Considerations Around Interpretation arXiv:2411.19090v1 Announce Type: new Abstract: Algorithmic bias continues to be a key concern of learning analytics. We study the statistical properties of the Absolute Between-ROC Area (ABROCA) metric. This fairness measure quantifies group-level differences in classifier performance through the absolute difference in ROC curves. ABROCA is…