Category: python

  • What Statistics Can Tell Us About NBA Coaches

    Using Python to determine where NBA coaches come from and what makes them successful. This post first appeared on Towards Data Science. By Brayden Gerrard

  • Use PyTorch to Easily Access Your GPU

    Let’s say you are lucky enough to have access to a system with an Nvidia Graphics Processing Unit (GPU). Did you know there is an absurdly easy way to use your GPU’s capabilities through a Python library intended and predominantly used for machine learning (ML) applications? Don’t worry…

  • Building AI Applications in Ruby

    This is the second in a multi-part series on creating web applications with generative AI integration. Part 1 focused on explaining the AI stack and why the application layer is the best place in the stack to be. I thought spas were supposed…

  • Optimizing Multi-Objective Problems with Desirability Functions

    When working in Data Science, it is not uncommon to encounter problems with competing objectives. Whether designing products, tuning algorithms or optimizing portfolios, we often need to balance several metrics to get the best possible outcome. Sometimes, maximizing one metric comes at the expense of another, making it hard…

  • Understanding Random Forest using Python (scikit-learn)

    Decision trees are a popular supervised learning algorithm: they can be used for both regression and classification, and they are easy to interpret. However, decision trees aren’t the most performant algorithm and are prone to overfitting due to small variations in the training…

  • Strength in Numbers: Ensembling Models with Bagging and Boosting

    Bagging and boosting are two powerful ensemble techniques in machine learning, and they are must-knows for data scientists! After reading this article, you are going to have a solid understanding of how bagging and boosting work and when to use them. We’ll cover the following topics,…

  • Pause Your ML Pipelines for Human Review Using AWS Step Functions + Slack

    Have you ever wanted to pause an automated workflow to wait for a human decision? Maybe you need approval before provisioning cloud resources, promoting a machine learning model to production, or charging a customer’s credit card. In many data science and machine learning…

  • Running Python Programs in Your Browser

    In recent years, WebAssembly (often abbreviated as WASM) has emerged as an interesting technology that extends web browsers’ capabilities far beyond the traditional realms of HTML, CSS, and JavaScript. For Python developers, one particularly exciting application is the ability to run Python code directly in the browser. In this…

  • Time Series Forecasting Made Simple (Part 2): Customizing Baseline Models

    Thank you for the kind response to Part 1; it’s been encouraging to see so many readers interested in time series forecasting. In Part 1 of this series, we broke down time series data into trend, seasonality, and noise, discussed when to use additive versus…

  • Clustering Eating Behaviors in Time: A Machine Learning Approach to Preventive Health

    It’s well known that what we eat matters, but what if when and how often we eat matters just as much? In the midst of ongoing scientific debate around the benefits of intermittent fasting, this question becomes even more intriguing. As someone passionate about machine learning and healthy living,…

  • Generating Data Dictionary for Excel Files Using OpenPyxl and AI Agents

    At every company I have worked for so far, there it was: the resilient MS Excel. Excel was first released in 1985 and has remained strong to this day. It has survived the rise of relational databases, the evolution of many programming languages, the Internet with…

  • Real-Time Interactive Sentiment Analysis in Python

    You know what the best part of being an engineer is? You can just build stuff. It’s like a superpower. One rainy afternoon I had this random idea of creating a sentiment visualization of a text input with a smiley face that changes its expression based on how positive…

  • From RGB to HSV — and Back Again

    A fundamental concept in Computer Vision is understanding how images are stored and represented. On disk, image files are encoded in various ways, from lossy, compressed JPEG files to lossless PNG files. Once you load an image into a program and decode it from the respective…

  • Retrieval Augmented Classification: Improving Text Classification with External Knowledge

    Text classification stands as one of the most basic yet most important applications of natural language processing. It plays a vital role in many real-world applications, ranging from filtering unwanted emails such as spam to detecting product categories or classifying user intent in a chatbot application. The…

  • Making Sense of KPI Changes

    As analysts, we are usually monitoring metrics. Quite often, metrics change. And when they do, it’s our job to figure out what’s going on: why did the conversion rate suddenly drop, or what is driving consistent revenue growth? I started my journey in data analytics as a KPI analyst. For almost…

  • Fine-Tuning vLLMs for Document Understanding

    In this article, I discuss how you can fine-tune VLMs (visual large language models, often called vLLMs) like Qwen 2.5 VL 7B. I will introduce you to a dataset of handwritten digits, which the base version of Qwen 2.5 VL struggles with. We will then inspect the dataset, annotate it,…

  • Why I stopped Using Cursor and Reverted to VSCode

    In December 2024, I wrote an article sharing my experience using VSCode (GitHub Copilot) and Cursor (Claude 3.5 Sonnet) from the perspective of a Data Scientist: “Should you switch from VSCode to Cursor?” I concluded the article by stating: After using Cursor for the past two…

  • Rust for Python Developers: Why You Should Take a Look at the Rust Programming Language

    The Rust programming language is now appearing in many feeds, as it offers a performant and secure way to write programs. If you come from the Python world of Pandas, Jupyter or Flask, you might think that…

  • Modern GUI Applications for Computer Vision in Python

    I’m a huge fan of interactive visualizations. As a computer vision engineer, I deal almost daily with image processing tasks, and more often than not I am iterating on a problem where I need visual feedback to make decisions. Let’s think of a very simple image…

  • NumExpr: The “Faster than Numpy” Library Most Data Scientists Have Never Used

    Browsing GitHub the other day, I came across a library I’d never heard of before. It was called NumExpr. I was immediately interested because of some claims made about the library. In particular, it stated that for some complex numerical calculations, it was…

  • How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

    The recent launch of the DeepSeek-R1 model sent ripples across the global AI community. It delivered breakthroughs on par with the reasoning models from Meta and OpenAI, achieving this in a fraction of the time and at a significantly lower cost. Beyond…

  • Data Science: From School to Work, Part IV

    Let’s start with a simple example that will appeal to most of us. If you want to check if the blinkers of your car are working properly, you sit in the car, turn on the ignition and test a turn signal to see if the front…

  • Building a Personal API for Your Data Projects with FastAPI

    How many times have you had a messy Jupyter Notebook filled with copy-pasted code just to re-use some data wrangling logic? Whether you do it for passion or for work, if you code a lot, then you’ve probably answered something like “way too many”. You’re…

  • Beyond the Code: Unconventional Lessons from Empathetic Interviewing

    Recently, I’ve been interviewing Computer Science students applying for data science and engineering internships with a 4-day turnaround from CV vetting to final decisions. With a small local office of 10 and no in-house HR, hiring managers handle the entire process. This article reflects on the lessons…

  • When Physics Meets Finance: Using AI to Solve Black-Scholes

    DISCLAIMER: This is not financial advice. I have a PhD in Aerospace Engineering with a strong focus on Machine Learning: I’m not a financial advisor. This article is intended solely to demonstrate the power of Physics-Informed Neural Networks (PINNs) in a financial context. When I was 16,…

  • When Predictors Collide: Mastering VIF in Multicollinear Regression

    In regression models, the independent variables should be uncorrelated or only slightly correlated with each other. When they are strongly dependent, this is referred to as multicollinearity, which leads to unstable models and results that are difficult to interpret. The…

  • Are You Sure Your Posterior Makes Sense?

    This article is co-authored by Felipe Bandeira, Giselle Fretta, Thu Than, and Elbion Redenica. We also thank Prof. Carl Scheffler for his support. Parameter estimation has for decades been one of the most important topics in statistics. While frequentist approaches, such as Maximum Likelihood Estimation, used to…

  • The Invisible Revolution: How Vectors Are (Re)defining Business Success

    In a world increasingly focused on data, business leaders must understand vector thinking. At first, vectors may appear as complicated as algebra was in school, but they serve as a fundamental building block. Vectors are as essential as algebra for tasks like sharing a bill…

  • Time Series Forecasting Made Simple (Part 1): Decomposition and Baseline Models

    I used to avoid time series analysis. Every time I took an online course, I’d see a module titled “Time Series Analysis” with subtopics like Fourier Transforms, autocorrelation functions and other intimidating terms. I don’t know why, but I always found a reason to avoid…

  • Mining Rules from Data

    Working with products, we might face a need to introduce some “rules”. Let me explain what I mean by “rules” in practical examples: Imagine that we’re seeing a massive wave of fraud in our product, and we want to restrict onboarding for a particular segment of customers to lower this risk. For…

  • How to Optimize your Python Program for Slowness

    Also available: A Rust version of this article. Everyone talks about making Python programs faster [1, 2, 3], but what if we pursue the opposite goal? Let’s explore how to make them slower, absurdly slower. Along the way, we’ll examine the nature of computation, the role of memory,…

  • How I Would Learn To Code (If I Could Start Over)

    According to various sources, the average salary for coding jobs is ~£47.5k in the UK, which is ~35% higher than the median salary of about £35k. So, coding is a very valuable skill that will earn you more money, not to mention it’s really fun.…

  • Creating an AI Agent to Write Blog Posts with CrewAI

    I love writing. You may notice that if you follow me or my blog. For that reason, I am constantly producing new content and talking about Data Science and Artificial Intelligence. I discovered this passion a couple of years ago when I was just…

  • A Simple Implementation of the Attention Mechanism from Scratch

    The Attention Mechanism is often associated with the transformer architecture, but it was already used in RNNs. In Machine Translation or MT (e.g., English-Italian) tasks, when you want to predict the next Italian word, you need your model to focus, or pay attention, on the…

  • Create Your Supply Chain Analytics Portfolio to Land Your Dream Job

    Supply chains are under pressure like never before. From climate-driven disruptions to geopolitical shifts, businesses must adapt to rising costs, new trade barriers and growing sustainability demands. In this new world where supply chains face uncertainty, Supply Chain Analytics is essential to keep operations resilient. Samir, can…

  • Master the 3D Reconstruction Process: A Step-by-Step Guide

    The 3D reconstruction journey from 2D photographs to 3D models follows a structured path. This path consists of distinct steps that build upon each other to transform flat images into spatial information. Understanding this pipeline is crucial for anyone looking to create high-quality 3D reconstructions. Let me…

  • Data Science: From School to Work, Part III

    Writing code is about solving problems, but not every problem is predictable. In the real world, your software will encounter unexpected situations: missing files, invalid user inputs, network timeouts, or even hardware failures. This is why handling errors isn’t just a nice-to-have; it’s a critical part…

  • AI Agents from Zero to Hero — Part 2

    In Part 1 of this tutorial series, we introduced AI Agents, autonomous programs that perform tasks, make decisions, and communicate with others. Agents perform actions through Tools. It might happen that a Tool doesn’t work on the first try, or that multiple Tools must be…

  • The Ultimate AI/ML Roadmap For Beginners

    AI is transforming the way businesses operate, and nearly every company is exploring how to leverage this technology. As a result, the demand for AI and machine learning skills has skyrocketed in recent years. With nearly four years of experience in AI/ML, I’ve decided to create the ultimate guide…

  • What Germany Currently Is Up To, Debt-Wise

    €1,600 per second. That’s how much interest Germany has to pay for its debts. In total, the German state has debts ranging into the trillions, that is, more than a thousand billion euros. And the government is planning to take on even more: up to one trillion additional debt is…

  • Linear Regression in Time Series: Sources of Spurious Regression

    It’s pretty clear that most of our work will be automated by AI in the future. This will be possible because many researchers and professionals are working hard to make their work available online. These contributions not only help us understand fundamental concepts but…

  • Comprehensive Guide to Dependency Management in Python

    When learning Python, many beginners focus solely on the language and its libraries while completely ignoring virtual environments. As a result, managing Python projects can become a mess: dependencies installed for different projects may have conflicting versions, leading to compatibility issues. Even when I studied Python, nobody…

  • How to Spot and Prevent Model Drift Before it Impacts Your Business

    Despite the AI hype, many tech companies still rely heavily on machine learning to power critical applications, from personalized recommendations to fraud detection. I’ve seen firsthand how undetected drifts can result in significant costs: missed fraud detection, lost revenue, and suboptimal business…

  • LLM + RAG: Creating an AI-Powered File Reader Assistant

    AI is everywhere. It is hard not to interact at least once a day with a Large Language Model (LLM). The chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails…well, they do a lot.…

  • Data Science: From School to Work, Part II

    In my previous article, I highlighted the importance of effective project management in Python development. Now, let’s shift our focus to the code itself and explore how to write clean, maintainable code, an essential practice in professional and collaborative environments. Readability & Maintainability: Well-structured code is easier to…

  • Debugging the Dreaded NaN

    You are training your latest AI model, anxiously watching as the loss steadily decreases when suddenly, boom! Your logs are flooded with NaNs (Not a Number), your model is irreparably corrupted, and you’re left staring at your screen in despair. To make matters worse, the NaNs don’t appear consistently.…

  • Is Python Set to Surpass Its Competitors?

    A soufflé is a baked egg dish that originated in France in the 18th century. The process of making an elegant and delicious French soufflé is complex, and in the past, it was typically only prepared by professional French pastry chefs. However, with pre-made soufflé mixes now widely…

  • Efficient Data Handling in Python with Arrow

    We’re all used to working with CSVs, JSON files… With traditional libraries and large datasets, these can be extremely slow to read, write and operate on, leading to performance bottlenecks (been there). It’s precisely with big amounts of data that being efficient handling the…

  • Tutorial: Semantic Clustering of User Messages with LLM Prompts

    As a Developer Advocate, it’s challenging to keep up with user forum messages and understand the big picture of what users are saying. There’s plenty of valuable content, but how can you quickly spot the key conversations? In this tutorial, I’ll show you an AI…

  • Publish Interactive Data Visualizations for Free with Python and Marimo

    Working in Data Science, it can be hard to share insights from complex datasets using only static figures. All the facets that describe the shape and meaning of interesting data are not always captured in a handful of pre-generated figures. While we have powerful technologies…

  • Method of Moments Estimation with Python Code

    Let’s say you are in a customer care center, and you would like to know the probability distribution of the number of calls per minute, or in other words, you want to answer the question: what is the probability of receiving zero, one, two, … etc., calls per…

  • Manage Environment Variables with Pydantic

    Developers work on applications that are meant to be deployed on a server so that anyone can use them. Typically, on the machine where these apps live, developers set up environment variables that allow the app to run. These variables can be API keys of external services,…

  • Triangle Forecasting: Why Traditional Impact Estimates Are Inflated (And How to Fix Them)

    Accurate impact estimations can make or break your business case. Yet, despite their importance, most teams use oversimplified calculations that can lead to inflated projections. These shot-in-the-dark numbers not only destroy credibility with stakeholders but can also result in misallocation of resources and…

  • The Method of Moments Estimator for Gaussian Mixture Models

    Audio processing is one of the most important application domains of digital signal processing (DSP) and machine learning. Modeling acoustic environments is an essential step in developing digital audio processing systems such as speech recognition, speech enhancement, and acoustic echo cancellation. Acoustic environments are filled with background…

  • Efficient Metric Collection in PyTorch: Avoiding the Performance Pitfalls of TorchMetrics

    Metric collection is an essential part of every machine learning project, enabling us to track model performance and monitor training progress. Ideally, metrics should be collected and computed without introducing any additional overhead to the training process. However, just like other components of the…

  • Introduction to Minimum Cost Flow Optimization in Python

    Minimum cost flow optimization minimizes the cost of moving flow through a network of nodes and edges. Nodes include sources (supply) and sinks (demand), with different costs and capacity limits. The aim is to find the least costly way to move volume from sources to sinks while…

  • Awesome Plotly with code series (Part 9): To dot, to slope or to stack?

    Simple methods to replace cluttered bar charts with crisp, reader-friendly visuals. By Jose Parreño

  • Stop Creating Bad DAGs — Optimize Your Airflow Environment By Improving Your Python Code

    Valuable tips to reduce your DAGs’ parse time and save resources. Apache Airflow is one of the most popular orchestration tools in the data field, powering workflows…

  • Your Neural Network Can’t Explain This. TMLE to the Rescue!

    Targeted Maximum Likelihood Estimation (TMLE) helps you explain patterns where other techniques fall short. By Ari Joury, PhD

  • The Solar Cycle(s): history, data analysis and trend forecasting.

    A brief article on the Solar Cycles, the history behind their observation, data analysis and time series forecasting for the incoming solar maximum in 2025–2026 and the next decades. You have probably heard about the 11-year Solar Cycle…

  • Building a Data Dashboard

    Using the streamlit Python library. By Thomas Reid

  • Satellite Image Classification with Deep Learning — Complete Project

    A Comprehensive Guide Using PyTorch and CNNs. By Leo Anello

  • Hands-On Delivery Routes Optimization (TSP) with AI, Using LKH and Python

    Here’s how to optimize the delivery routes, from theory to code. By Piero Paialunga

  • How To: Forecast Time Series Using Lags

    Lag columns can significantly boost your model’s performance. By Haden Pelletier

  • Using Constraint Programming to Solve Math Theorems

    Case study: the quasigroups existence problem. TLDR: Some mathematical theorems can be solved by combinatorial exploration. In this article, we focus on the problem of the existence of some quasigroups. We will demonstrate the existence or non-existence of some quasigroups using NuCS. NuCS is a fast constraint…

  • What is MicroPython? Do I Need to Know it as a Data Scientist?

    In this year’s edition of the Stack Overflow survey, MicroPython appears among the Most Popular Technologies with 1.6%, but why? By Sarah Lea

  • LightGBM: The Fastest Option of Gradient Boosting

    Learn how to implement a fast and effective Gradient Boosting model using Python. By Gustavo R Santos

  • Building Visual Agents that can Navigate the Web Autonomously

    A step-by-step guide to creating visual agents that can navigate the web autonomously. By Luís Roque

  • 3 Powerful Examples of the Python Re Library

    Explore the power of regex and save time in data analysis. By Suraj Gurav

  • Sentiment Analysis with Transformers: A Complete Deep Learning Project — PT. I

    Master Fine-Tuning Transformers, Comparing Deep Learning Architectures, and Deploying Sentiment Analysis Models. By Leo Anello

  • How to Run Jupyter Notebooks and Generate HTML Reports with Python Scripts

    A step-by-step guide to automating Jupyter Notebook execution and report generation using Python. By Amanda Iglesias Moreno

  • Predicting a Ball Trajectory

    Polynomial Fit in Python with NumPy. By Florian Trautweiler

  • Transforming Data into Solutions: Building a Smart App with Python and AI

    Some financial analysts worry that artificial intelligence may not justify the massive investments being made in the field. While I understand their concerns, I see things differently. I’m neither an AI Boomer nor an AI Doomer; I believe AI has the potential to drive…

  • Creating SMOTE Oversampling from Scratch

    A Python tutorial on how to implement oversampling and how to make custom variations. By Hari Devanathan

  • Introducing n-Step Temporal-Difference Methods

    Dissecting “Reinforcement Learning” by Richard S. Sutton with custom Python implementations, Episode V. By Oliver S

  • Deep Dive into Multithreading, Multiprocessing, and Asyncio

    How to choose the right concurrency model. Python provides three main approaches to handle multiple tasks simultaneously: multithreading, multiprocessing, and asyncio. Choosing the right model is crucial for maximising your program’s performance and efficiently using system resources. (P.S. It is also a common interview…

  • Master Bots Before Starting with AI Agents: Simple Steps to Create a Mastodon Bot with Python

    I recently published a post on Mastodon that was shared by six other accounts within two minutes. Curious, I visited the profiles and… By Sarah Lea

  • Unlocking the Untapped Potential of Retrieval-Augmented Generation (RAG) Pipelines

    Essential Metrics and Methods to Enhance Performance Across Retrieval, Generation, and End-to-End Pipelines. By Saleh Alkhalifa

  • Understanding When and How to Implement FastAPI Middleware (Examples and Use Cases)

    Supercharge Your FastAPI with Middleware: Practical Use Cases and Examples. By Mike Huls

  • Three Important Pandas Functions You Need to Know

    Master these techniques to stand out as a Python developer. By Jiayan Yin

  • I’m Doing the Advent of Code 2024 in Python — Day 4

    Let’s see how many stars we’ll collect. By Soner Yıldırım

  • Design Patterns with Python for Machine Learning Engineers: Template Method

    Learn how to use the Template design pattern to enhance your code. By Marcello Politi

  • How to Tackle an Optimization Problem with Constraint Programming

    Case study: the travelling salesman problem. TLDR: Constraint Programming is a technique of choice for solving a Constraint Satisfaction Problem. In this article, we will see that it is also well suited to small to medium optimization problems. Using the well-known travelling salesman problem (TSP) as an…

  • Why Sets Are So Useful in Programming

    And how you can use them to boost your code performance. A set is a simple structure defined as a collection of distinct elements. Sets are most commonly seen in fields like mathematics or logic, but they’re also useful in programming for writing efficient code. In this article,…

  • Creating a WhatsApp AI Agent with GPT-4o

    Creating a WhatsApp AI Agent with GPT-4o. How to use the Meta API to build your own LLM-powered WhatsApp chatbot. A game-changer in the field of AI and business management is the integration of AI agents with widely used communication tools. Think of having a familiar chat interface with real-time data requests, updates, and…

  • What Every Aspiring Machine Learning Engineer Must Know to Succeed

    What Every Aspiring Machine Learning Engineer Must Know to Succeed. Your Guide to Avoiding Critical Errors with Machine Learning in Production. By Claudia Ng, Towards Data Science.

  • Propensity-Score Matching Is the Bedrock of Causal Inference

    Propensity-Score Matching Is the Bedrock of Causal Inference. And how to get started with it using Python. By Ari Joury, PhD, Towards Data Science.

  • Should you switch from VSCode to Cursor?

    Should you switch from VSCode to Cursor? My experience using VSCode (GitHub Copilot) and Cursor (Claude 3.5 Sonnet) as a Data Scientist. By Marc Matterson, Towards Data Science.

  • Transform Customer Feedback into Actionable Insights with CrewAI and Streamlit

    Transform Customer Feedback into Actionable Insights with CrewAI and Streamlit. Build an AI-powered app to analyze unstructured feedback, generate insightful reports, and create interactive visualizations. By Alan Jones, Towards Data Science.

  • The Algorithm That Made Google Google

    The Algorithm That Made Google Google. How PageRank transformed how we searched the internet, and why it’s still playing an important role in LLMs with Graph RAG. By Cristian Leo, Towards Data Science.

  • Master Machine Learning: 4 Classification Models Made Simple

    Master Machine Learning: 4 Classification Models Made Simple. A Beginner’s Guide to Building Models in 15 Practical Steps. By Leo Anello, Towards Data Science.

  • Agentic AI: Building Autonomous Systems from Scratch

    Agentic AI: Building Autonomous Systems from Scratch. A Step-by-Step Guide to Creating Multi-Agent Frameworks in the Age of Generative AI. By Luís Roque, Towards Data Science.

  • CV VideoPlayer — Once and For All

    CV VideoPlayer — Once and For All. A Python video player package made for computer vision research. When developing computer vision algorithms, the journey from concept to working implementation often involves countless iterations of watching, analyzing, and debugging video frames. As I dove deeper into computer vision projects, I found myself repeatedly…

  • Sentiment analysis template: A complete data science project

    Sentiment analysis template: A complete data science project. 10 essential steps, from data exploration to model deployment. By Leo Anello, Towards Data Science.

  • How to Evaluate Multilingual LLMs With Global-MMLU

    How to Evaluate Multilingual LLMs With Global-MMLU. Evaluation of language-specific LLM accuracy on the global Massive Multitask Language Understanding benchmark in Python. By Dr. Leon Eversberg, Towards Data Science.

  • I’m Doing the Advent of Code 2024 in Python — Day 1

    I’m Doing the Advent of Code 2024 in Python — Day 1. Let’s see how many stars we’ll collect. By Soner Yıldırım, Towards Data Science.

  • Multimodal RAG: Process Any File Type with AI

    Multimodal RAG: Process Any File Type with AI. A beginner-friendly guide with example (Python) code. This is the third article in a larger series on multimodal AI. In the previous posts, we discussed multimodal LLMs and embedding models, respectively. In this article, we will combine these ideas to enable the development of multimodal RAG systems. I’ll…

  • Who Does What in Data? A Practical Introduction to the Role of a Data Engineer & Data Scientist

    Who Does What in Data? A Practical Introduction to the Role of a Data Engineer & Data Scientist. What does a data engineer do differently to a data scientist? By Sarah Lea, Towards Data Science.

  • Dunder Methods: The Hidden Gems of Python

    Dunder Methods: The Hidden Gems of Python. Real-world examples on how actively using special methods can simplify coding and improve readability. Dunder methods, though possibly a basic topic in Python, are something I have often noticed being understood only superficially, even by people who have been coding for quite some time. Disclaimer: This is a forgivable…
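
    To make the idea of special methods concrete, here is a small sketch, assuming a hypothetical `Money` class that does not come from the article. It shows how `__repr__`, `__eq__`, `__add__`, and `__lt__` let instances work with `print`, `==`, `+`, and `sorted` without any extra helper functions.

```python
class Money:
    """Hypothetical value type used only to illustrate dunder methods."""

    def __init__(self, amount, currency="USD"):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        # Used by repr() and when instances appear inside printed containers.
        return f"Money({self.amount!r}, {self.currency!r})"

    def __eq__(self, other):
        # Returning NotImplemented lets Python fall back to its default handling.
        if not isinstance(other, Money):
            return NotImplemented
        return (self.amount, self.currency) == (other.amount, other.currency)

    def __add__(self, other):
        # Enables the `+` operator between two Money values.
        if not isinstance(other, Money):
            return NotImplemented
        if self.currency != other.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)

    def __lt__(self, other):
        # sorted() and min() only need `<` to order instances.
        if not isinstance(other, Money):
            return NotImplemented
        if self.currency != other.currency:
            raise ValueError("cannot compare amounts in different currencies")
        return self.amount < other.amount


if __name__ == "__main__":
    total = Money(5) + Money(7)
    print(total)
    print(sorted([Money(3), Money(1), Money(2)]))
```

    Note that defining `__eq__` without `__hash__` makes instances unhashable, so a class like this would also need `__hash__` before being stored in a set or used as a dict key.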