Category: large-language-models

LangGraph 101: Let’s Build A Deep Research Agent

Learn LangGraph fundamentals from Google’s open-source full-stack implementation. First appeared on Towards Data Science. By Shuai Guo.

August 15, 2025
How to Use LLMs for Powerful Automatic Evaluations

A beginner-friendly introduction to LLM-as-a-Judge. First appeared on Towards Data Science. By Eivind Kjosbakken.

August 14, 2025
Coconut: A Framework for Latent Reasoning in LLMs

Explaining Coconut (Training Large Language Models to Reason in a Continuous Latent Space) in simple terms. First appeared on Towards Data Science. By Youssef Farag.

August 13, 2025
Introducing Google’s LangExtract tool

Do RAG without doing RAG with this powerful new NLP and data extraction library. First appeared on Towards Data Science. By Thomas Reid.

August 12, 2025
Generating Structured Outputs from LLMs

An overview of popular techniques to confine LLMs’ output to a predefined schema. First appeared on Towards Data Science. By Ibrahim Habib.

August 9, 2025
Agentic AI: On Evaluations

Metrics to track for RAG and agents, plus the frameworks that help. First appeared on Towards Data Science. By Ida Silfverskiöld.

August 8, 2025
Context Engineering — A Comprehensive Hands-On Tutorial with DSPy

Let’s dissect the art and science of context engineering, one module at a time! First appeared on Towards Data Science. By Avishek Biswas.

August 6, 2025
How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus

Welcome to the 21st century by the hand of large language models and reasoning AI agents. First appeared on Towards Data…

August 6, 2025
LLMs and Mental Health

Are LLMs good or bad for our mental health? It’s more complicated than that. First appeared on Towards Data Science. By Stephanie Kirmer.

August 1, 2025
How to Evaluate Graph Retrieval in MCP Agentic Systems

A framework for measuring retrieval quality in Model Context Protocol agents. First appeared on Towards Data Science. By Tomaz Bratanic.

July 30, 2025
Talk to my Agent

The exciting new world of designing conversation-driven APIs for LLMs. First appeared on Towards Data Science. By Roni Dover.

July 29, 2025
Declarative and Imperative Prompt Engineering for Generative AI

Conceptual overview and practical considerations. First appeared on Towards Data Science. By Chinmay Kakatkar.

July 26, 2025
How To Significantly Enhance LLMs by Leveraging Context Engineering

The benefits and practical aspects of context engineering for LLMs. First appeared on Towards Data Science. By Eivind Kjosbakken.

July 22, 2025
MCP Client Development with Streamlit: Build Your AI-Powered Web App

MCP client development with Streamlit to enhance the tool-calling capabilities of remote MCP servers: setting up your development environment, securing API keys, handling user input, connecting to remote MCP servers, and displaying AI-generated responses.

July 22, 2025
Advanced Topic Modeling with LLMs

A deep dive into topic modeling by leveraging representation models and generative AI with BERTopic. First appeared on Towards Data Science. By Alex Davis.

July 22, 2025
The Age of Self-Evolving AI Is Here

How Meta’s latest breakthrough lets models learn, adapt, and improve, all on their own. First appeared on Towards Data Science. By Moulik Gupta.

July 18, 2025
Your 1M+ Context Window LLM Is Less Powerful Than You Think

Why working memory is a more important bottleneck than raw context window size. First appeared on Towards Data Science. By Tobias Schnabel.

July 18, 2025
Exploring Prompt Learning: Using English Feedback to Optimize LLM Systems

Prompt learning presents a compelling approach for continuous improvement of AI applications. First appeared on Towards Data Science. By Aparna Dhinakaran.

July 17, 2025
The Power of Building from Scratch

Mauro Di Pietro discusses building AI agents with open-source tools, bridging theory and practice, and why he’s still nostalgic for scikit-learn. First appeared on Towards Data Science. By TDS Editors.

July 17, 2025
How to Ensure Reliability in LLM Applications

Learn how to make your LLM applications more robust. First appeared on Towards Data Science. By Eivind Kjosbakken.

July 16, 2025
From Equal Weights to Smart Weights: OTPO’s Approach to Better LLM Alignment

Using optimal transport to weight what matters most in LLM-generated responses. First appeared on Towards Data Science. By Sudheer Singh.

July 16, 2025
Topic Model Labelling with LLMs

Python tutorial for reproducible labeling of cutting-edge topic models with GPT-4o-mini. First appeared on Towards Data Science. By Petr Koráb.

July 15, 2025
Are You Being Unfair to LLMs?

They may deserve better. First appeared on Towards Data Science. By Julian Mendel.

July 12, 2025
Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain

Scaling a simple RAG pipeline from simple notes to full books. First appeared on Towards Data Science. By Maria Mouschoutzi.

July 12, 2025
Evaluation-Driven Development for LLM-Powered Products: Lessons from Building in Healthcare

How metrics and monitoring combine with human expertise to build trustworthy AI in healthcare. First appeared on Towards Data Science. By Robert Martin-Short.

July 11, 2025
Work Data Is the Next Frontier for GenAI

9 reasons why work data is the single most valuable data source for LLM training, uniquely capable of propelling LLM performance to unprecedented heights. First appeared on Towards Data Science. By Zsombor Varnagy-Toth.

July 10, 2025
Recap of all types of LLM Agents

Regular, ReAct, Chain-of-Thought, Reflexion, ToT, GoT, PoT. First appeared on Towards Data Science. By Mauro Di Pietro.

July 10, 2025
How to Fine-Tune Small Language Models to Think with Reinforcement Learning

A visual tour and from-scratch guide to training GRPO reasoning models in PyTorch. First appeared on Towards Data Science. By Avishek Biswas.

July 9, 2025
Fairness Pruning: Precision Surgery to Reduce Bias in LLMs

From unjustified shootings to neutral stories: how to fix toxic narratives with selective pruning. First appeared on Towards Data Science. By Pere Martra.

July 4, 2025
Software Engineering in the LLM Era

On growing new software engineers, even when it’s inefficient. First appeared on Towards Data Science. By Stephanie Kirmer.

July 3, 2025
From Pixels to Plots

How I built an AI-powered prototype to turn images into insights. First appeared on Towards Data Science. By Jens Winkelmann.

July 1, 2025
Become a Better Data Scientist with These Prompt Engineering Tips and Tricks

Part 1: prompt engineering for planning, cleaning, and EDA. First appeared on Towards Data Science. By Sara Nobrega.

July 1, 2025
A Developer’s Guide to Building Scalable AI: Workflows vs Agents

A practical guide to choosing between AI agents and workflows for production systems, covering the hidden costs, architectural trade-offs, and decision framework that can save you thousands in deployment mistakes. Includes real-world examples and a scoring system to determine which approach fits your specific use…

June 28, 2025
Use OpenAI Whisper for Automated Transcriptions

Streamline your computer interactions using OpenAI’s Whisper model. First appeared on Towards Data Science. By Eivind Kjosbakken.

June 26, 2025
How to Train a Chatbot Using RAG and Custom Data

Retrieval-Augmented Generation made easy with Llama. First appeared on Towards Data Science. By Haden Pelletier.

June 26, 2025
Why Your Next LLM Might Not Have A Tokenizer

The Tokenizer Has Been a Necessary Evil, but This Radical Approach Shows That It Might Not Be Necessary Anymore. First appeared on Towards Data Science. By Moulik Gupta.

June 25, 2025
Reinforcement Learning from Human Feedback, Explained Simply

The one technique that made ChatGPT so smart. First appeared on Towards Data Science. By Vyacheslav Efimov.

June 24, 2025
LLM-as-a-Judge: A Practical Guide

How to Scale LLM Evaluations Beyond Manual Review. First appeared on Towards Data Science. By Shuai Guo.

June 20, 2025
Beyond Code Generation: Continuously Evolve Text with LLMs

Long-running content evolution and an introduction to result analysis. First appeared on Towards Data Science. By Julian Mendel.

June 19, 2025
AI Is Not a Black Box (Relatively Speaking)

Compared to the opacity around human intelligence, AI is more transparent in some very tangible ways. First appeared on Towards Data Science. By Piotr (Peter) Mardziel.

June 14, 2025
Connecting the Dots for Better Movie Recommendations

Lightweight graph RAG on Rotten Tomatoes movie reviews. First appeared on Towards Data Science. By Brian Godsey.

June 13, 2025
Design Smarter Prompts and Boost Your LLM Output: Real Tricks from an AI Engineer’s Toolbox

Not just what you ask, but how you ask it. Practical techniques for prompt engineering that deliver. First appeared on Towards Data Science. By Ugo Pradère…

June 13, 2025
Model Context Protocol (MCP) Tutorial: Build Your First MCP Server in 6 Steps

A beginner-friendly tutorial on MCP architecture, with a focus on MCP server components and applications, that guides you through building a custom MCP server that enables code-to-diagram.

June 12, 2025
LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries

Local Large Language Models can convert massive DataFrames to presentable Markdown reports. Here’s how. First appeared on Towards Data Science. By Dario Radečić.

June 3, 2025
Evaluating LLMs for Inference, or Lessons from Teaching for Machine Learning

It’s like grading papers, but your student is an LLM. First appeared on Towards Data Science. By Stephanie Kirmer.

June 3, 2025
Agentic RAG Applications: Company Knowledge Slack Agents

Lessons learnt using LlamaIndex and Modal. First appeared on Towards Data Science. By Ida Silfverskiöld.

May 31, 2025
LLM Optimization: LoRA and QLoRA

Scalable fine-tuning techniques for large language models. First appeared on Towards Data Science. By Vyacheslav Efimov.

May 31, 2025
The Hidden Security Risks of LLMs

And why self-hosting might be the safer bet. First appeared on Towards Data Science. By Anouk Dutrée.

May 30, 2025
Tree of Thought Prompting: Teaching LLMs to Think Slowly

Playing Minesweeper with Augmented Reasoning. First appeared on Towards Data Science. By Shuyang.

May 29, 2025
New to LLMs? Start Here

A guide to Agents, LLMs, RAG, Fine-tuning, and LangChain, with practical examples to start building. First appeared on Towards Data Science. By Alessandra Costa.

May 24, 2025
What the Most Detailed Peer-Reviewed Study on AI in the Classroom Taught Us

The rapid proliferation and superb capabilities of widely available LLMs have ignited intense debate within the educational sector. On one side, they offer students a 24/7 tutor who is always available to help; but then of course students can use LLMs to…

May 21, 2025
Boost 2-Bit LLM Accuracy with EoRA

Quantization is one of the key techniques for reducing the memory footprint of large language models (LLMs). It works by converting the data type of model parameters from higher-precision formats such as 32-bit floating point (FP32) or 16-bit floating point (FP16/BF16) to lower-precision integer formats, typically INT8 or INT4.…

May 15, 2025
Empowering LLMs to Think Deeper by Erasing Thoughts

Recent large language models (LLMs), such as OpenAI’s o1/o3, DeepSeek’s R1 and Anthropic’s Claude 3.7, demonstrate that allowing the model to think deeper and longer at test time can significantly enhance the model’s reasoning capability. The core approach underlying their deep thinking capability is called…

May 13, 2025
How I Finally Understood MCP — and Got It Working in Real Life

Table of contents: Introduction: Why I Wrote This; The Evolution of Tool Integration with LLMs; What Is Model Context Protocol (MCP), Really?; Wait, MCP sounds like RAG… but is it?; In an MCP-based setup; In a traditional RAG system; Traditional RAG Implementation; MCP Implementation…

May 13, 2025
What My GPT Stylist Taught Me About Prompting Better

When I built a GPT-powered fashion assistant, I expected runway looks, not memory loss, hallucinations, or semantic déjà vu. But what unfolded became a lesson in how prompting really works, and why LLMs are more like wild animals than tools. This article builds on my previous article on…

May 10, 2025
How Not to Write an MCP Server

I recently had the chance to create an MCP server for an observability application in order to provide the AI agent with dynamic code analysis capabilities. Because of its potential to transform applications, MCP is a technology I’m even more ecstatic about than I originally was about genAI…

May 10, 2025
Retrieval Augmented Classification: Improving Text Classification with External Knowledge

Text classification stands as one of the most basic yet most important applications of natural language processing. It plays a vital role in many real-world applications, ranging from filtering unwanted emails such as spam to detecting product categories or classifying user intent in a chatbot application. The…

May 7, 2025
Build and Query Knowledge Graphs with LLMs

Knowledge Graphs are relevant. A Knowledge Graph could be defined as a structured representation of information that connects concepts, entities, and their relationships in a way that mimics human understanding. It is often used to organise and integrate data from various sources, enabling machines to reason, infer, and retrieve relevant…

May 3, 2025
Attaining LLM Certainty with AI Decision Circuits

The promise of AI agents has taken the world by storm. Agents can interact with the world around them, write articles (not this one though), take actions on your behalf, and generally make the difficult part of automating any task easy and approachable. Agents take aim at the most…

May 3, 2025
Step-by-Step Guide to Build and Deploy an LLM-Powered Chat with Memory in Streamlit

In this post, I’ll show you step by step how to build and deploy a chat powered by an LLM (Gemini) in Streamlit and monitor the API usage on Google Cloud Console. Streamlit is a Python framework that makes it super easy to turn your…

May 2, 2025
From FOMO to Opportunity: Analytical AI in the Era of LLM Agents

Are you feeling “fear of missing out” (FOMO) when it comes to LLM agents? Well, that was the case for me for quite a while. In recent months, it feels like my online feeds have been completely bombarded by “LLM Agents”: every other…

April 30, 2025
Building a Scalable and Accurate Audio Interview Transcription Pipeline with Google Gemini

This article is co-authored by Ugo Pradère and David Haüet. How hard can it be to transcribe an interview? You feed the audio to an AI model, wait a few minutes, and boom: perfect transcript, right? Well… not quite. When it comes to…

April 30, 2025
How to Level Up Your Technical Skills in This AI Era

AI-assisted coding is here to stay. Tools like Cursor, V0, and Lovable have dramatically lowered the barrier to entry: building dashboards, pipelines, or entire apps can now be done in a fraction of the time. I use these tools daily, and they’ve definitely made me…

April 30, 2025
A Step-By-Step Guide To Powering Your Application With LLMs

You might be wondering whether GenAI is just hype or external noise. I also thought this was hype, and I could sit this one out until the dust cleared. Oh, boy, was I wrong. GenAI has real-world applications. It also generates revenue for companies, so we expect…

April 26, 2025
Behind the Magic: How Tensors Drive Transformers

Transformers have changed the way artificial intelligence works, especially in understanding language and learning from data. At the core of these models are tensors (a generalized type of mathematical matrix that helps process information). As data moves through the different parts of a Transformer, these tensors…

April 26, 2025
How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

The recent launch of the DeepSeek-R1 model sent ripples across the global AI community. It delivered breakthroughs on par with the reasoning models from Meta and OpenAI, achieving this in a fraction of the time and at a significantly lower cost. Beyond…

April 24, 2025
Retrieval Augmented Generation (RAG) — An Introduction

The model hallucinated! It was giving me OK answers and then it just started hallucinating. We’ve all heard or experienced it. Natural Language Generation models can sometimes hallucinate, i.e., they start generating text that is not quite accurate for the prompt provided. In layman’s terms, they start making…

April 22, 2025
Beyond the Code: Unconventional Lessons from Empathetic Interviewing

Recently, I’ve been interviewing Computer Science students applying for data science and engineering internships with a 4-day turnaround from CV vetting to final decisions. With a small local office of 10 and no in-house HR, hiring managers handle the entire process. This article reflects on the lessons…

April 22, 2025
Load-Testing LLMs Using LLMPerf

Deploying your Large Language Model (LLM) is not necessarily the final step in productionizing your Generative AI application. An often forgotten, yet crucial part of the MLOps lifecycle is properly load testing your LLM and ensuring it is ready to withstand your expected production traffic. Load testing at a high level…

April 19, 2025
The Good-Enough Truth

Could Shopify be right in requiring teams to demonstrate why AI can’t do a job before approving new human hires? Will companies that prioritize AI solutions eventually evolve into AI entities with significantly fewer employees? These are open-ended questions that have puzzled me about where such transformations might leave us in our quest for…

April 18, 2025
An Unbiased Review of Snowflake’s Document AI

As data professionals, we’re comfortable with tabular data… We can also handle words, JSON, XML feeds, and pictures of cats. But what about a cardboard box full of things like this? (Image by Annie Spratt, Unsplash) The info on this receipt wants so…

April 16, 2025
Kernel Case Study: Flash Attention

The attention mechanism is at the core of modern-day transformers. But scaling the context window of these transformers was a major challenge, and it still is, even though we are in the era of million-token-plus context windows (Qwen 2.5 [1]). There are both considerable compute and memory…

April 4, 2025
Agentic GraphRAG for Commercial Contracts

In every business, legal contracts are foundational documents that define the relationships, obligations, and responsibilities between parties. Whether it’s a partnership agreement, an NDA, or a supplier contract, these documents often contain critical information that drives decision-making, risk management, and compliance. However, navigating and extracting insights from these contracts can…

April 3, 2025
Talk to Videos

Large language models (LLMs) are improving in efficiency and are now able to understand different data formats, offering possibilities for a myriad of applications in different domains. Initially, LLMs were inherently able to process only text. The image understanding feature was integrated by coupling an LLM with another image encoding model. However, gpt-4o…

March 28, 2025
Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and More

It’s no news that artificial intelligence has made huge strides in recent years, particularly with the advent of multimodal models that can process and create both text and images, and some very new ones that also process and produce…

March 26, 2025
Build Your Own AI Coding Assistant in JupyterLab with Ollama and Hugging Face

Jupyter AI brings generative AI capabilities right into the Jupyter interface. Having a local AI assistant ensures privacy, reduces latency, and provides offline functionality, making it a powerful tool for developers. In this article, we’ll learn how to set up a local…

March 25, 2025
R.E.D.: Scaling Text Classification with Expert Delegation

With the new age of problem-solving augmented by Large Language Models (LLMs), only a handful of problems remain that have subpar solutions. Most classification problems (at a PoC level) can be solved by leveraging LLMs at 70–90% Precision/F1 with just good prompt engineering techniques, as well as adaptive…

March 21, 2025
Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs

Creating efficient prompts for large language models often starts as a simple task… but it doesn’t always stay that way. Initially, following basic best practices seems sufficient: adopt the persona of a specialist, write clear instructions, require a specific response format, and…

March 15, 2025
Are You Still Using LoRA to Fine-Tune Your LLM?

LoRA (Low Rank Adaptation, arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS … And most are based…

March 14, 2025
Using GPT-4 for Personal Styling

I’ve always been fascinated by fashion, collecting unique pieces and trying to blend them in my own way. But let’s just say my closet was more of a work-in-progress avalanche than a curated wonderland. Every time I tried to add something new, I risked toppling my carefully balanced piles. Why this…

March 8, 2025
Overcome Failing Document Ingestion & RAG Strategies with Agentic Knowledge Distillation

Many generative AI use cases still revolve around Retrieval Augmented Generation (RAG), yet consistently fall short of user expectations. Despite the growing body of research on RAG improvements and even adding Agents into the process, many solutions still fail to return exhaustive results,…

March 6, 2025
Generative AI Is Declarative

ChatGPT launched in 2022 and kicked off the generative AI boom. In the two years since, academics, technologists, and armchair experts have written libraries’ worth of articles on the technical underpinnings of generative AI and about the potential capabilities of both current and future generative AI models. Surprisingly little has been…

March 6, 2025
How to Train LLMs to “Think” (o1 & DeepSeek-R1)

In September 2024, OpenAI released its o1 model, trained on large-scale reinforcement learning, giving it “advanced reasoning” capabilities. Unfortunately, the details of how they pulled this off were never shared publicly. Today, however, DeepSeek (an AI research lab) has replicated this reasoning behavior and published the…

March 4, 2025
LLM + RAG: Creating an AI-Powered File Reader Assistant

AI is everywhere. It is hard not to interact at least once a day with a Large Language Model (LLM). The chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails… well, they do a lot.…

March 4, 2025
Avoidable and Unavoidable Randomness in GPT-4o

Of course there is randomness in GPT-4o’s outputs. After all, the model samples from a probability distribution when choosing each token. But what I didn’t understand was that those very probabilities themselves are not deterministic. Even with consistent prompts, fixed seeds, and temperature set to zero, GPT-4o still introduces…

March 4, 2025
Unraveling Large Language Model Hallucinations

Unraveling Large Language Model Hallucinations Introduction In a YouTube video titled Deep Dive into LLMs like ChatGPT, former Senior Director of AI at Tesla, Andrej Karpathy, discusses the psychology of Large Language Models (LLMs) as emergent cognitive effects of the training pipeline. This article is inspired by his explanation of LLM hallucinations and the information presented in the…

March 1, 2025
How LLMs Work: Reinforcement Learning, RLHF, DeepSeek R1, OpenAI o1, AlphaGo

How LLMs Work: Reinforcement Learning, RLHF, DeepSeek R1, OpenAI o1, AlphaGo Welcome to part 2 of my LLM deep dive. If you’ve not read Part 1, I highly encourage you to check it out first. Previously, we covered the first two major stages of training an LLM: Pre-training — Learning from massive datasets to form a base…

February 28, 2025
Enhancing RAG: Beyond Vanilla Approaches

Enhancing RAG: Beyond Vanilla Approaches Retrieval-Augmented Generation (RAG) is a powerful technique that enhances language models by incorporating external information retrieval mechanisms. While standard RAG implementations improve response relevance, they often struggle in complex retrieval scenarios. This article explores the limitations of a vanilla RAG setup and introduces advanced techniques to enhance its accuracy and…

February 25, 2025
6 Common LLM Customization Strategies Briefly Explained

6 Common LLM Customization Strategies Briefly Explained Why Customize LLMs? Large Language Models (LLMs) are deep learning models pre-trained with self-supervised learning, requiring vast amounts of training data and training time and holding a large number of parameters. LLMs have revolutionized natural language processing, especially in the last 2 years, demonstrating remarkable…

February 25, 2025
How to Use an LLM-Powered Boilerplate for Building Your Own Node.js API

How to Use an LLM-Powered Boilerplate for Building Your Own Node.js API For a long time, one of the common ways to start new Node.js projects was using boilerplate templates. These templates help developers reuse familiar code structures and implement standard features, such as access to cloud file storage. With the latest developments in LLMs,…

February 21, 2025
Formulation of Feature Circuits with Sparse Autoencoders in LLM

Formulation of Feature Circuits with Sparse Autoencoders in LLM Large Language Models (LLMs) have made impressive progress and can perform a variety of tasks, from generating human-like text to answering questions. However, understanding how these models work remains challenging, especially due to a phenomenon called superposition where features are mixed into one…

February 20, 2025
How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them. I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact…

February 19, 2025
Tutorial: Semantic Clustering of User Messages with LLM Prompts

Tutorial: Semantic Clustering of User Messages with LLM Prompts As a Developer Advocate, it’s challenging to keep up with user forum messages and understand the big picture of what users are saying. There’s plenty of valuable content — but how can you quickly spot the key conversations? In this tutorial, I’ll show you an AI…

February 18, 2025
How to Measure the Reliability of a Large Language Model’s Response

How to Measure the Reliability of a Large Language Model’s Response The basic principle of Large Language Models (LLMs) is very simple: to predict the next word (or token) in a sequence of words based on statistical patterns in their training data. However, this seemingly simple capability turns out to be incredibly sophisticated when it…

February 13, 2025
Synthetic Data Generation with LLMs

Synthetic Data Generation with LLMs Popularity of RAG Over the past two years while working with financial firms, I’ve observed firsthand how they identify and prioritize Generative AI use cases, balancing complexity with potential value. Retrieval-Augmented Generation (RAG) often stands out as a foundational capability across many LLM-driven solutions, striking a balance between ease of implementation…

February 8, 2025
Training Large Language Models: From TRPO to GRPO

Training Large Language Models: From TRPO to GRPO DeepSeek has recently caused quite a buzz in the AI community, thanks to its impressive performance at relatively low costs. I think this is a perfect opportunity to dive deeper into how Large Language Models (LLMs) are trained. In this article, we will focus on the Reinforcement Learning…

February 6, 2025
Supercharge Your RAG with Multi-Agent Self-RAG

Supercharge Your RAG with Multi-Agent Self-RAG Introduction Many of us might have tried to build a RAG application and noticed it falls significantly short of addressing real-life needs. Why is that? It’s because many real-world problems require multiple steps of information retrieval and reasoning. We need our agent to perform those as humans normally do,…

February 6, 2025
From Resume to Cover Letter Using AI and LLM, with Python and Streamlit

From Resume to Cover Letter Using AI and LLM, with Python and Streamlit DISCLAIMER: The idea of writing a cover letter or even a resume with AI obviously does not start with me. A lot of people have done this before (very successfully) and have built websites and even companies from the idea. This is just a…

February 5, 2025
Beyond Causal Language Modeling

Beyond Causal Language Modeling A deep dive into “Not All Tokens Are What You Need for Pretraining” Introduction A few days ago, I had the chance to present at a local reading group that focused on some of the most exciting and insightful papers from NeurIPS 2024. As a presenter, I selected a paper titled…

January 28, 2025
Large Language Models: A Short Introduction

Large Language Models: A Short Introduction And why you should care about LLMs There’s an acronym you’ve probably heard non-stop for the past few years: LLM, which stands for Large Language Model. In this article we’re going to take a brief look at what LLMs are, why they’re an extremely exciting piece of technology, why…

January 22, 2025