Category: deep-dives

➡️ Start Asking Your Data ‘Why?’ — A Gentle Intro To Causality

Correlation does not imply causation. It turns out, however, that with some simple, ingenious tricks one can potentially unveil causal relationships within standard observational data without resorting to expensive randomised controlled trials. This post is targeted towards anyone making data driven…

February 15, 2025
The Gamma Hurdle Distribution

Which Outcome Matters? Here is a common scenario: an A/B test was conducted in which a random sample of units (e.g. customers) was selected for a campaign and received Treatment A. Another sample was selected to receive Treatment B. “A” could be a communication or offer and “B” could be…

February 8, 2025
A Visual Guide to How Diffusion Models Work

This article is aimed at those who want to understand exactly how Diffusion Models work, with no prior knowledge expected. I’ve tried to use illustrations wherever possible to provide visual intuitions on each part of these models. I’ve kept mathematical notation and equations to a minimum, and where…

February 7, 2025
Injecting domain expertise into your AI system

How to connect the dots between AI technology and real life. When starting their AI initiatives, many companies are trapped in silos and treat AI as a purely technical enterprise, sidelining domain experts or involving them too late. They end up with generic AI applications that miss…

February 2, 2025
Inequality in Practice: E-commerce Portfolio Analysis

From Mathematical Theory to Actionable Insights: A 6-Year Shopify Case Study. Are your top-selling products making or breaking your business? It’s terrifying to think your entire revenue might collapse if one or two products fall out…

February 1, 2025
Analyze Tornado Data with Python and GeoPandas

Insights from NOAA’s public domain database. Lee Vaughan, Towards Data Science.

January 29, 2025
Why Generative-AI Apps’ Quality Often Sucks and What to Do About It

How to get from PoCs to tested high-quality applications in production. The generative AI hype has rolled through the business world in the past two years. This technology can make business process executions more efficient,…

January 21, 2025
Where to Start When Data is Limited

A launch pad for projects with small datasets. Machine Learning (ML) has driven remarkable breakthroughs in computer vision, natural language processing, and speech recognition, largely due to the abundance of data in these fields. However, many challenges, especially those tied to specific product features or…

January 18, 2025
The AI (R)Evolution, Looking From 2024 Into the Immediate Future

Witnessing rapid innovation, fierce competition, and transformative tools for life, work, and human development. LucianoSphere (Luciano Abriata, PhD), Towards Data Science.

January 14, 2025
A Visual Understanding of Neural Networks

The math behind neural networks visually explained. Reza Bagheri, Towards Data Science.

January 12, 2025
GDD: Generative Driven Design

Reflective generative AI software components as a development paradigm. Nowhere has the proliferation of generative AI tooling been more aggressive than in the world of software development. It began with GitHub Copilot’s supercharged autocomplete, then exploded into direct code-along integrated tools like Aider and Cursor that allow software engineers to dictate…

January 2, 2025
Top 12 Skills Data Scientists Need to Succeed in 2025

It’s (not) all about LLMs and AI tools. Benjamin Bodner, Towards Data Science.

January 1, 2025
A Bird’s-Eye View of Linear Algebra: Orthonormal Matrices

Orthonormal matrices: the most elegant matrices in all of linear algebra. Rohit Pandey, Towards Data Science.

December 25, 2024
100 Years of (eXplainable) AI

Reflecting on advances and challenges in deep learning and explainability in the ever-evolving era of LLMs and AI governance. Background: Imagine you are navigating a self-driving car, relying entirely on its onboard computer to make split-second decisions. It detects objects, identifies pedestrians, and can even anticipate behavior of…

December 19, 2024
The Anatomy of an Autonomous Agent

A blueprint for autonomous agents in an Agentic Mesh ecosystem. Eric Broda, Towards Data Science.

December 18, 2024
Transformers Key-Value (KV) Caching Explained

Speed up your LLM inference. Michał Oleszak, Towards Data Science.

December 13, 2024
Missing Data in Time-Series: Machine Learning Techniques

Part 1: Leverage linear regression and decision trees to impute time-series gaps. Sara Nóbrega, Towards Data Science.

December 11, 2024
Bridging the Data Literacy Gap

The Advent, Evolution, and Current State of “Data Translators”. Introduction: With data being constantly glorified as the most valuable asset organizations can own, leaders and decision-makers are always looking for effective ways to put their data insights to use. Every time customers interact with digital products, millions of data points…

December 6, 2024
Introducing Univariate Exemplar Recommenders: how to profile Customer Behavior in a single vector

Customer Profiling. Surveying and improving the current methodologies for customer profiling. To understand this article, knowledge of embeddings, clustering, and recommendation systems is required. The implementation of this algorithm has been released on GitHub and is fully open-source. I am open to…

December 5, 2024
The Name That Broke ChatGPT: Who is David Mayer?

AI, privacy, human bias, prompting, the future of content, and how to hack a chatbot. Cassie Kozyrkov, Towards Data Science.

December 4, 2024
The Intuition behind Concordance Index — Survival Analysis

Ranking accuracy versus absolute accuracy. How long would you keep your gym membership before you decide to cancel it? Or Netflix, if you are a series…

November 29, 2024
Mistral 7B Explained: Towards More Efficient Language Models

RMS Norm, RoPE, GQA, SWA, KV Cache, and more! Part 5 in the “LLMs from Scratch” series, a complete guide to understanding and building Large Language Models. If you are interested in learning more about how these models work, I encourage you to read: Part 1: Tokenization — A Complete Guide; Part 2:…

November 27, 2024