Tag: how
-
How to Correctly Apply Limits on the Result in DAX (and SQL)
How to Correctly Apply Limits on the Result in DAX (and SQL) What if the output of a measure mustn’t be above a specific limit? How can we ensure that the total is calculated correctly? This piece is about correctly calculating and summarizing such output. The post How to Correctly Apply Limits on the Result…
-
How to Create Powerful LLM Applications with Context Engineering
How to Create Powerful LLM Applications with Context Engineering Improve your LLM by optimizing its context The post How to Create Powerful LLM Applications with Context Engineering appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How different is “Senior Data Analyst” from “Data Scientist”?
How different is “Senior Data Analyst” from “Data Scientist”? I often see Senior DA roles that seem focused on using R/Python for analysis (vs. Excel and Power BI), but don’t have any insight into the day-to-day of these roles. At the senior level, how different is Data Analyst from Data Scientist? submitted by /u/empirical-sadboy [link]…
-
How to Use LLMs for Powerful Automatic Evaluations
How to Use LLMs for Powerful Automatic Evaluations A beginner-friendly introduction to LLM-as-a-Judge The post How to Use LLMs for Powerful Automatic Evaluations appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How to Design Machine Learning Experiments — the Right Way
How to Design Machine Learning Experiments — the Right Way The key to successful ML projects isn’t always more resources The post How to Design Machine Learning Experiments — the Right Way appeared first on Towards Data Science. TDS Editors Go to original source
-
How to Write Insightful Technical Articles
How to Write Insightful Technical Articles Learn how to write informative technical articles The post How to Write Insightful Technical Articles appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How I Won the “Mostly AI” Synthetic Data Challenge
How I Won the “Mostly AI” Synthetic Data Challenge A deep dive into how post-processing can supercharge synthetic data generation The post How I Won the “Mostly AI” Synthetic Data Challenge appeared first on Towards Data Science. Daniel Gärber Go to original source
-
How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus
How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus Welcome to the 21st century by the hand of large language models and reasoning AI agents The post How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus appeared first on Towards Data…
-
How Computers “See” Molecules
How Computers “See” Molecules Generative Molecular Design (Part 1): common molecular representations in data science. The post How Computers “See” Molecules appeared first on Towards Data Science. Tianyuan Zheng Go to original source
-
How to Benchmark LLMs – ARC AGI 3
How to Benchmark LLMs – ARC AGI 3 Learn how LLMs are benchmarked, and try out the newly released ARC AGI 3 The post How to Benchmark LLMs – ARC AGI 3 appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How Your Prompts Lead AI Astray
How Your Prompts Lead AI Astray Practical tips to recognise and avoid prompt bias. The post How Your Prompts Lead AI Astray appeared first on Towards Data Science. Daphne de Klerk Go to original source
-
How to Evaluate Graph Retrieval in MCP Agentic Systems
How to Evaluate Graph Retrieval in MCP Agentic Systems A framework for measuring retrieval quality in Model Context Protocol agents. The post How to Evaluate Graph Retrieval in MCP Agentic Systems appeared first on Towards Data Science. Tomaz Bratanic Go to original source
-
How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned
How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned A hands-on journey exploring fine-tuning techniques that unlock the power of small vision models. The post How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned appeared first on Towards Data Science. Julio Sanchez Go…
-
How Do Grayscale Images Affect Visual Anomaly Detection?
How Do Grayscale Images Affect Visual Anomaly Detection? A practical exploration focusing on performance and speed The post How Do Grayscale Images Affect Visual Anomaly Detection? appeared first on Towards Data Science. Aimira Baitieva Go to original source
-
How Not to Mislead with Your Data-Driven Story
How Not to Mislead with Your Data-Driven Story Data storytelling can enlighten—but it can also deceive. When persuasive narratives meet biased framing, cherry-picked data, or misleading visuals, insights risk becoming illusions. This article explores the hidden biases embedded in data-driven storytelling—from the seduction of beautiful charts to the quiet influence of AI-generated insights—and offers practical…
-
From Rules to Relationships: How Machines Are Learning to Understand Each Other
From Rules to Relationships: How Machines Are Learning to Understand Each Other Using knowledge graphs to handle the unexpected in semantic communication The post From Rules to Relationships: How Machines Are Learning to Understand Each Other appeared first on Towards Data Science. Shireesh Kumar Singh Go to original source
-
How would you structure a project (data frame) to scrape and track listing changes over time?
How would you structure a project (data frame) to scrape and track listing changes over time? I’m working on a project where I want to scrape data daily (e.g., real estate listings from a site like RentFaster or Zillow) and track how each listing changes over time. I want to be able to answer questions…
-
How to Ensure Reliability in LLM Applications
How to Ensure Reliability in LLM Applications Learn how to make your LLM applications more robust The post How to Ensure Reliability in LLM Applications appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How Metrics (and LLMs) Can Trick You: A Field Guide to Paradoxes
How Metrics (and LLMs) Can Trick You: A Field Guide to Paradoxes When numbers lie — and your metrics mislead you The post How Metrics (and LLMs) Can Trick You: A Field Guide to Paradoxes appeared first on Towards Data Science. Subha Ganapathi Go to original source
-
How much DSA for FAANG+ ?
How much DSA for FAANG+ ? Hello all, I am going to be graduating in 6 months and have been practicing Leetcode as I believe this to be my weakest point. I have solved 250 LC with 130 Easy and 120 Hard, covering concepts like arrays, hashing, binary trees, SQL, linked list, two pointers, stack,…
-
How do you efficiently traverse hundreds of features in the dataset?
How do you efficiently traverse hundreds of features in the dataset? Currently, working on a fintech classification algorithm, with close to a thousand features which is very tiresome. I’m not a domain expert, so creating sensible hypotheses is difficult. How do you tackle EDA and forming reasonable hypotheses in these cases? Even with proper documentation…
-
How to Perform Effective Data Cleaning for Machine Learning
How to Perform Effective Data Cleaning for Machine Learning Learn how you can improve your machine learning models using effective data cleaning The post How to Perform Effective Data Cleaning for Machine Learning appeared first on Towards Data Science. Eivind Kjosbakken Go to original source
-
How to Fine-Tune Small Language Models to Think with Reinforcement Learning
How to Fine-Tune Small Language Models to Think with Reinforcement Learning A visual tour and from-scratch guide to train GRPO reasoning models in PyTorch The post How to Fine-Tune Small Language Models to Think with Reinforcement Learning appeared first on Towards Data Science. Avishek Biswas Go to original source
-
How to Access NASA’s Climate Data — And How It’s Powering the Fight Against Climate Change Pt. 1
How to Access NASA’s Climate Data — And How It’s Powering the Fight Against Climate Change Pt. 1 From architectural design to food security. The post How to Access NASA’s Climate Data — And How It’s Powering the Fight Against Climate Change Pt. 1 appeared first on Towards Data Science. Marco Hening Tallarico Go to…
-
From Pixels to Plots
From Pixels to Plots How I built an AI-powered prototype to turn images into insights The post From Pixels to Plots appeared first on Towards Data Science. Jens Winkelmann Go to original source
-
How to Train a Chatbot Using RAG and Custom Data
How to Train a Chatbot Using RAG and Custom Data Retrieval-Augmented Generation made easy with Llama The post How to Train a Chatbot Using RAG and Custom Data appeared first on Towards Data Science. Haden Pelletier Go to original source
-
How AI Agents “Talk” to Each Other
How AI Agents “Talk” to Each Other Minimize chaos and maintain inter-agent harmony in your projects The post How AI Agents “Talk” to Each Other appeared first on Towards Data Science. TDS Editors Go to original source
-
How to Transition From Data Analyst to Data Scientist
How to Transition From Data Analyst to Data Scientist Playbook on how data analysts can become data scientists The post How to Transition From Data Analyst to Data Scientist appeared first on Towards Data Science. Egor Howell Go to original source
-
How I Automated My Machine Learning Workflow with Just 10 Lines of Python
How I Automated My Machine Learning Workflow with Just 10 Lines of Python Use LazyPredict and PyCaret to skip the grunt work and jump straight to performance. The post How I Automated My Machine Learning Workflow with Just 10 Lines of Python appeared first on Towards Data Science. Himanshu Sharma Go to original source
-
LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries
LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries Local Large Language Models can convert massive DataFrames to presentable Markdown reports — here’s how. The post LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries appeared first on Towards Data Science. Dario Radečić Go to original source
-
How Microsoft Power BI Elevated My Data Analysis and Visualization Workflow
How Microsoft Power BI Elevated My Data Analysis and Visualization Workflow Explaining useful features every data analyst needs The post How Microsoft Power BI Elevated My Data Analysis and Visualization Workflow appeared first on Towards Data Science. Benjamin Nweke Go to original source
-
How to Generate Synthetic Data: A Comprehensive Guide Using Bayesian Sampling and Univariate Distributions
How to Generate Synthetic Data: A Comprehensive Guide Using Bayesian Sampling and Univariate Distributions Data makes the engine run in many organisations. But what if the number of observations is too low or there is only expert knowledge? I will demonstrate how to generate synthetic data with applications in predictive maintenance. The post How to…
-
How to Evaluate LLMs and Algorithms — The Right Way
How to Evaluate LLMs and Algorithms — The Right Way Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more. Subscribe today! All the hard work it takes to integrate large language models and powerful algorithms into your workflows can go to waste…
-
Survival Analysis When No One Dies: A Value-Based Approach
Survival Analysis When No One Dies: A Value-Based Approach Survival Analysis is a statistical approach used to answer the question: “How long will something last?” That “something” could range from a patient’s lifespan to the durability of a machine component or the duration of a user’s subscription. One of the most widely used tools in…
-
How I Built Business-Automating Workflows with AI Agents
How I Built Business-Automating Workflows with AI Agents AI agents and automation are no longer just a trend — they are transforming how companies operate. In a previous article, I shared several case studies of AI Agents supporting the sustainability roadmaps of small, medium and large companies. AI Agents for Sustainability — (Image by Samir Saci) This is part of a…
-
Why Most Cyber Risk Models Fail Before They Begin
Why Most Cyber Risk Models Fail Before They Begin Cybersecurity leaders are being asked impossible questions. “What’s the likelihood of a breach this year?” “How much would it cost?” And “how much should we spend to stop it?” Yet most risk models used today are still built on guesswork, gut instinct, and colorful heatmaps, not…
-
Are We Watching More Ads Than Content? Analyzing YouTube Sponsor Data
Are We Watching More Ads Than Content? Analyzing YouTube Sponsor Data I’m definitely not the only person who feels that YouTube sponsor segments have become longer and more frequent recently. Sometimes, I watch videos that seem to be trying to sell me something every couple of seconds. On one hand, it’s great that both small and…
-
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities Introduction: Can AI really distinguish dog breeds like human experts? One day while taking a walk, I saw a fluffy white puppy and wondered, Is that a Bichon Frise or a Maltese? No matter how closely I looked, they seemed almost identical.…
-
What Germany Currently Is Up To, Debt-Wise
What Germany Currently Is Up To, Debt-Wise €1,600 per second. That’s how much interest Germany has to pay for its debts. In total, the German state has debts ranging into the trillions — more than a thousand billion Euros. And the government is planning to make even more, up to one trillion additional debt is…
-
Learning Pareto manifolds in high dimensions: How can regularization help?
Learning Pareto manifolds in high dimensions: How can regularization help? arXiv:2503.08849v1 Announce Type: new Abstract: Simultaneously addressing multiple objectives is becoming increasingly important in modern machine learning. At the same time, data is often high-dimensional and costly to label. For a single objective such as prediction risk, conventional regularization techniques are known to improve generalization…
-
How to Develop Complex DAX Expressions
How to Develop Complex DAX Expressions At some point or another, any Power BI developer must write complex DAX expressions to analyze data. But nobody tells you how to do it. What’s the process for doing it? What is the best way to do it, and how supportive can a development process be? These are the questions…
-
Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend
Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend Running cool experiments is easily one of my favorite parts of working in data science. Most experiments don’t deliver big wins, so the winners make for fun stories. We’ve had a few of these at IntelyCare, and I’m sharing each story in a way…
-
Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board
Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board Running experiments is a task that often falls to data scientists. If that’s you, congrats! It can be a rewarding and high-impact area of work, but also requires tools found outside the typical ML-heavy data science curriculum. Even with the best tools, only…
-
Generative AI Is Declarative
Generative AI Is Declarative ChatGPT launched in 2022 and kicked off the generative AI boom. In the two years since, academics, technologists, and armchair experts have written libraries worth of articles on the technical underpinnings of generative AI and about the potential capabilities of both current and future generative AI models. Surprisingly little has been…
-
How to Train LLMs to “Think” (o1 & DeepSeek-R1)
How to Train LLMs to “Think” (o1 & DeepSeek-R1) In September 2024, OpenAI released its o1 model, trained on large-scale reinforcement learning, giving it “advanced reasoning” capabilities. Unfortunately, the details of how they pulled this off were never shared publicly. Today, however, DeepSeek (an AI research lab) has replicated this reasoning behavior and published the…
-
Learnings from a Machine Learning Engineer — Part 4: The Model
Learnings from a Machine Learning Engineer — Part 4: The Model In this latest part of my series, I will share what I have learned on selecting a model for Image Classification and how to fine tune that model. I will also show how you can leverage the model to accelerate your labelling process, and…
-
How to Measure the Reliability of a Large Language Model’s Response
How to Measure the Reliability of a Large Language Model’s Response The basic principle of Large Language Models (LLMs) is very simple: to predict the next word (or token) in a sequence of words based on statistical patterns in their training data. However, this seemingly simple capability turns out to be incredibly sophisticated when it…
-
How Likely Is a Six Nations Grand Slam in 2025?
How Likely Is a Six Nations Grand Slam in 2025? Quantifying uncertainty in sports fixtures Photo by Thomas Serer on Unsplash Introduction For rugby fans the long wait is nearly over, like Christmas the Six Nations comes once a year to lift our spirits in the cold winter months. If you’re not very familiar with rugby, the…
-
How to do Date calculations in DAX
How to do Date calculations in DAX Moving back and forth in time is a common task for Time Intelligence in DAX. Let’s take a deeper look at how DATEADD() works. Continue reading on Towards Data Science » Salvatore Cagliari Go to original source
-
How Cheap Mortgages Transformed Poland’s Real Estate Market
How Cheap Mortgages Transformed Poland’s Real Estate Market Insights from a synthetic control group Continue reading on Towards Data Science » Lukasz Szubelak Go to original source
-
How to Utilize ModernBERT and Synthetic Data for Robust Text Classification
How to Utilize ModernBERT and Synthetic Data for Robust Text Classification Learn how to fine-tune ModernBERT and create augmentations of text samples Continue reading on Towards Data Science » Eivind Kjosbakken Go to original source
-
Understanding the Evolution of ChatGPT: Part 3— Insights from Codex and InstructGPT
Understanding the Evolution of ChatGPT: Part 3— Insights from Codex and InstructGPT Mastering the art of fine-tuning: Learnings for training your own LLMs. (Image from Unsplash) This is the third article in our GPT series, and also the most practical one: finally, we will talk about how to effectively fine-tune LLMs. It is practical in the…
-
How to Use Pre-Trained Language Models for Regression
How to Use Pre-Trained Language Models for Regression Why and how to convert mT5 into a regression metric for numerical prediction Continue reading on Towards Data Science » Aden Haussmann Go to original source
-
How To: Forecast Time Series Using Lags
How To: Forecast Time Series Using Lags Lag columns can significantly boost your model’s performance Continue reading on Towards Data Science » Haden Pelletier Go to original source
-
How we matured Fisher, our A/B testing library
How we matured Fisher, our A/B testing library submitted by /u/chomoloc0 [link] [comments] /u/chomoloc0 Go to original source
-
How to Run Jupyter Notebooks and Generate HTML Reports with Python Scripts
How to Run Jupyter Notebooks and Generate HTML Reports with Python Scripts A step-by-step guide to automating Jupyter Notebook execution and report generation using Python Continue reading on Towards Data Science » Amanda Iglesias Moreno Go to original source
-
How Recurrent Neural Networks (RNNs) Are Revolutionizing Decision-Making Research
How Recurrent Neural Networks (RNNs) Are Revolutionizing Decision-Making Research A deep dive into the world of computational modeling and its applications Continue reading on Towards Data Science » Kaushik Rajan Go to original source
-
How to Tell Among Two Regression Models with Statistical Significance
How to Tell Among Two Regression Models with Statistical Significance Diving into the F-test for nested models with algorithms, examples and code Continue reading on Towards Data Science » LucianoSphere (Luciano Abriata, PhD) Go to original source
-
How to Stand Out in The Data Science Job Market
How to Stand Out in The Data Science Job Market How to have the edge in your data science application Continue reading on Towards Data Science » Egor Howell Go to original source
-
Transforming Data into Solutions: Building a Smart App with Python and AI
Transforming Data into Solutions: Building a Smart App with Python and AI Some financial analysts worry that artificial intelligence may not justify the massive investments being made in the field. While I understand their concerns, I see things differently. I’m neither an AI Boomer nor an AI Doomer — I believe AI has the potential to drive…
-
Partial Dependence Plots: How to Discover Variables Influencing a Model
Partial Dependence Plots: How to Discover Variables Influencing a Model Have you ever wondered how machine learning models are constructed? ‘Explainability of machine learning models’ and ‘machine learning… Continue reading on Towards Data Science » Mythili Krishnan Go to original source
-
How to Ensure the Stability of a Model Using Jackknife Estimation
How to Ensure the Stability of a Model Using Jackknife Estimation How to ensure the robustness of a model and detect influential data observations Continue reading on Towards Data Science » Paula LC Go to original source
-
How To Start A Data Science Blog on Medium
How To Start A Data Science Blog on Medium Tips on how to get started, write your first article, and get noticed Continue reading on Towards Data Science » Haden Pelletier Go to original source
-
How to Clean Your Data for Your Real-Life Data Science Projects
How to Clean Your Data for Your Real-Life Data Science Projects How I treat missing values—with a quick Python Guide Continue reading on Towards Data Science » Mythili Krishnan Go to original source
-
How (and Where) ML Beginners Can Find Papers
How (and Where) ML Beginners Can Find Papers From conferences to surveys Continue reading on Towards Data Science » Pascal Janetzky Go to original source
-
How to Stand Out as a Junior Data Scientist
How to Stand Out as a Junior Data Scientist 7 things you can do to show your skills even if you have no experience at all Continue reading on Towards Data Science » Idit Cohen Go to original source
-
How Have Data Science Interviews Changed Over 4 Years?
How Have Data Science Interviews Changed Over 4 Years? An aggregated look on the differences between then & now: 2020 vs 2024 — some big frustrations and positive learnings. Continue reading on Towards Data Science » Matt Przybyla Go to original source
-
How to Apply the Central Limit Theorem to Constrained Data
How to Apply the Central Limit Theorem to Constrained Data What can we say about the mean of data distributed in an interval [a, b]? Continue reading on Towards Data Science » Ryan Burn Go to original source
-
How to Evaluate Multilingual LLMs With Global-MMLU
How to Evaluate Multilingual LLMs With Global-MMLU Evaluation of language-specific LLM accuracy on the global Massive Multitask Language Understanding benchmark in Python Continue reading on Towards Data Science » Dr. Leon Eversberg Go to original source
-
How to find freelance opportunities – what is the most typical type of project you do as freelance
How to find freelance opportunities – what is the most typical type of project you do as freelance Hi all, I have 5+ years of experience. I’m based in Europe. Lately I’m thinking of switching from full-time employee to contractor, doing freelancing and working for different companies at the same time. I think that freelancing…
-
Modeling DAU with Markov Chain
Modeling DAU with Markov Chain How to predict DAU using Duolingo’s growth model and control the prediction 1. Introduction Doubtlessly, DAU, WAU, and MAU — daily, weekly, and monthly active users — are critical business metrics. An article “How Duolingo reignited user growth” by Jorge Mazal, former CPO of Duolingo, is #1 in the Growth section of Lenny’s Newsletter…
-
How to Integrate AI and Data Science into Your Business Strategy
How to Integrate AI and Data Science into Your Business Strategy DATA SCIENCE CONSULTING Insider consulting guide to conducting a successful 2-day executive workshop Image by author using Canva “Our industry does not respect tradition — it only respects innovation.” — Satya Nadella, CEO Microsoft, Letter to employees in 2014 While not all industries are as competitive and cutthroat as the…
-
How to Solve a Simple Problem With Machine Learning
How to Solve a Simple Problem With Machine Learning A technical walkthrough of lesson one Continue reading on Towards Data Science » Oscar Leo Go to original source
-
How to Prune LLaMA 3.2 and Similar Large Language Models
How to Prune LLaMA 3.2 and Similar Large Language Models This article explores a structured pruning technique for state-of-the-art models that use a GLU architecture, enabling the creation of… Continue reading on Towards Data Science » Pere Martra Go to original source
-
How to Develop an Effective AI-Powered Legal Assistant
How to Develop an Effective AI-Powered Legal Assistant Create a machine-learning-based search into legal decisions Continue reading on Towards Data Science » Eivind Kjosbakken Go to original source
-
how does btrfs do it?
https://github.com/markfasheh/duperemove https://www.jdupes.com/ https://despairlabs.com/blog/posts/2024-10-27-openzfs-dedup-is-good-dont-use-it/