Category: llm

From Tokens to Theorems: Building a Neuro-Symbolic AI Mathematician

From Tokens to Theorems: Building a Neuro-Symbolic AI Mathematician The next Gauss may not be born — they may be spun up in the cloud The post From Tokens to Theorems: Building a Neuro-Symbolic AI Mathematician appeared first on Towards Data Science. Sean Moran Go to original source

September 9, 2025
The End-to-End Data Scientist’s Prompt Playbook

The End-to-End Data Scientist’s Prompt Playbook Part 3: Prompts for docs, DevOps, and stakeholder communication The post The End-to-End Data Scientist’s Prompt Playbook appeared first on Towards Data Science. Sara Nobrega Go to original source

September 9, 2025
Preventing Context Overload: Controlled Neo4j MCP Cypher Responses for LLMs

Preventing Context Overload: Controlled Neo4j MCP Cypher Responses for LLMs How timeouts, truncation, and result sanitization keep Cypher outputs LLM-ready The post Preventing Context Overload: Controlled Neo4j MCP Cypher Responses for LLMs appeared first on Towards Data Science. Tomaz Bratanic Go to original source

September 8, 2025
How to Context Engineer to Optimize Question Answering Pipelines

How to Context Engineer to Optimize Question Answering Pipelines Learn how to apply context engineering to enhance your question answering systems. The post How to Context Engineer to Optimize Question Answering Pipelines appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

September 6, 2025
Should We Use LLMs As If They Were Swiss Knives?

Should We Use LLMs As If They Were Swiss Knives? A logic game performance comparison between popular LLMs and a custom-made algorithm The post Should We Use LLMs As If They Were Swiss Knives? appeared first on Towards Data Science. Nicolas Garcia Aramouni Go to original source

September 5, 2025
How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques

How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques Optimize your AI search with RAG, contextual retrieval and evaluations The post How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

September 3, 2025
What is Universality in LLMs? How to Find Universal Neurons

What is Universality in LLMs? How to Find Universal Neurons How independently trained transformers form the same neurons The post What is Universality in LLMs? How to Find Universal Neurons appeared first on Towards Data Science. Shuyang Go to original source

September 3, 2025
How to Develop a Bilingual Voice Assistant

How to Develop a Bilingual Voice Assistant Exploring ways to make voice assistants more personal The post How to Develop a Bilingual Voice Assistant appeared first on Towards Data Science. Deepak Krishnamurthy Go to original source

September 1, 2025
A Brief History of GPT Through Papers

A Brief History of GPT Through Papers Language models are becoming really good. But where did they come from? The post A Brief History of GPT Through Papers appeared first on Towards Data Science. Rohit Pandey Go to original source

August 28, 2025
How to Develop Powerful Internal LLM Benchmarks

How to Develop Powerful Internal LLM Benchmarks Learn how to compare LLMs using your own internal benchmark The post How to Develop Powerful Internal LLM Benchmarks appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

August 27, 2025
Using Google’s LangExtract and Gemma for Structured Data Extraction

Using Google’s LangExtract and Gemma for Structured Data Extraction Extracting structured information effectively and accurately from long unstructured text with LangExtract and LLMs The post Using Google’s LangExtract and Gemma for Structured Data Extraction appeared first on Towards Data Science. Kenneth Leung Go to original source

August 27, 2025
How to Perform Comprehensive Large Scale LLM Validation

How to Perform Comprehensive Large Scale LLM Validation Learn how to validate large scale LLM applications The post How to Perform Comprehensive Large Scale LLM Validation appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

August 22, 2025
How We Reduced LLM Costs by 90% with 5 Lines of Code

How We Reduced LLM Costs by 90% with 5 Lines of Code When clean code hides inefficiencies: what we learned from fixing a few lines of code and saving 90% in LLM cost. The post How We Reduced LLM Costs by 90% with 5 Lines of Code appeared first on Towards Data Science. Uri Peled Go to…

August 22, 2025
“Where’s Marta?”: How We Removed Uncertainty From AI Reasoning

“Where’s Marta?”: How We Removed Uncertainty From AI Reasoning A primer on overcoming LLM limitations with formal verification. The post “Where’s Marta?”: How We Removed Uncertainty From AI Reasoning appeared first on Towards Data Science. Jacopo Tagliabue Go to original source

August 21, 2025
How to Create Powerful LLM Applications with Context Engineering

How to Create Powerful LLM Applications with Context Engineering Improve your LLM by optimizing its context The post How to Create Powerful LLM Applications with Context Engineering appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

August 19, 2025
How to Use LLMs for Powerful Automatic Evaluations

How to Use LLMs for Powerful Automatic Evaluations A beginner-friendly introduction to LLM-as-a-Judge The post How to Use LLMs for Powerful Automatic Evaluations appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

August 14, 2025
Coconut: A Framework for Latent Reasoning in LLMs

Coconut: A Framework for Latent Reasoning in LLMs Explaining Coconut (Training Large Language Models to Reason in a Continuous Latent Space) in simple terms The post Coconut: A Framework for Latent Reasoning in LLMs appeared first on Towards Data Science. Youssef Farag Go to original source

August 13, 2025
Fine-Tune Your Topic Modeling Workflow with BERTopic

Fine-Tune Your Topic Modeling Workflow with BERTopic Learn how to fine-tune BERTopic settings for more focused, reproducible, and interpretable results The post Fine-Tune Your Topic Modeling Workflow with BERTopic appeared first on Towards Data Science. Tiffany Chen Go to original source

August 13, 2025
Generating Structured Outputs from LLMs

Generating Structured Outputs from LLMs An overview of popular techniques to confine LLMs’ output to a predefined schema The post Generating Structured Outputs from LLMs appeared first on Towards Data Science. Ibrahim Habib Go to original source

August 9, 2025
Demystifying Cosine Similarity

Demystifying Cosine Similarity Mathematical intuition and practical considerations for NLP scenarios The post Demystifying Cosine Similarity appeared first on Towards Data Science. Chinmay Kakatkar Go to original source

August 9, 2025
Finding Golden Examples: A Smarter Approach to In-Context Learning

Finding Golden Examples: A Smarter Approach to In-Context Learning From random example selection to systematic AuPair generation — how to make your LLM prompts actually work The post Finding Golden Examples: A Smarter Approach to In-Context Learning appeared first on Towards Data Science. Sudheer Singh Go to original source

August 8, 2025
How to Evaluate Graph Retrieval in MCP Agentic Systems

How to Evaluate Graph Retrieval in MCP Agentic Systems A framework for measuring retrieval quality in Model Context Protocol agents. The post How to Evaluate Graph Retrieval in MCP Agentic Systems appeared first on Towards Data Science. Tomaz Bratanic Go to original source

July 30, 2025
Talk to my Agent

Talk to my Agent The exciting new world of designing conversation driven APIs for LLMs. The post Talk to my Agent appeared first on Towards Data Science. Roni Dover Go to original source

July 29, 2025
How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned

How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned A hands-on journey exploring fine-tuning techniques that unlock the power of small vision models. The post How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned appeared first on Towards Data Science. Julio Sanchez Go…

July 26, 2025
How To Significantly Enhance LLMs by Leveraging Context Engineering

How To Significantly Enhance LLMs by Leveraging Context Engineering The benefits and practical aspects of context engineering for LLMs The post How To Significantly Enhance LLMs by Leveraging Context Engineering appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

July 22, 2025
The Age of Self-Evolving AI Is Here

The Age of Self-Evolving AI Is Here How Meta’s latest breakthrough lets models learn, adapt, and improve — all on their own The post The Age of Self-Evolving AI Is Here appeared first on Towards Data Science. Moulik Gupta Go to original source

July 18, 2025
Your 1M+ Context Window LLM Is Less Powerful Than You Think

Your 1M+ Context Window LLM Is Less Powerful Than You Think Why working memory is a more important bottleneck than raw context window size The post Your 1M+ Context Window LLM Is Less Powerful Than You Think appeared first on Towards Data Science. Tobias Schnabel Go to original source

July 18, 2025
3 Steps to Context Engineering a Crystal-Clear Project

3 Steps to Context Engineering a Crystal-Clear Project Learn three easy steps for gaining an intelligent picture for any project by using the skill of context engineering. The post 3 Steps to Context Engineering a Crystal-Clear Project appeared first on Towards Data Science. Kory Becker Go to original source

July 17, 2025
Do You Really Need a Foundation Model?

Do You Really Need a Foundation Model? LLM or custom model: how should you choose the right solution? The post Do You Really Need a Foundation Model? appeared first on Towards Data Science. Vincent Vandenbussche Go to original source

July 16, 2025
How to Ensure Reliability in LLM Applications

How to Ensure Reliability in LLM Applications Learn how to make your LLM applications more robust The post How to Ensure Reliability in LLM Applications appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

July 16, 2025
From Equal Weights to Smart Weights: OTPO’s Approach to Better LLM Alignment

From Equal Weights to Smart Weights: OTPO’s Approach to Better LLM Alignment Using optimal transport to weight what matters most in LLM-generated responses The post From Equal Weights to Smart Weights: OTPO’s Approach to Better LLM Alignment appeared first on Towards Data Science. Sudheer Singh Go to original source

July 16, 2025
Topic Model Labelling with LLMs

Topic Model Labelling with LLMs Python tutorial for reproducible labeling of cutting-edge topic models with GPT-4o-mini. The post Topic Model Labelling with LLMs appeared first on Towards Data Science. Petr Koráb Go to original source

July 15, 2025
Are You Being Unfair to LLMs?

Are You Being Unfair to LLMs? They may deserve better. The post Are You Being Unfair to LLMs? appeared first on Towards Data Science. Julian Mendel Go to original source

July 12, 2025
Building a Custom MCP Chatbot

Building a Custom MCP Chatbot Understanding all the details of the model context protocol The post Building a Custom MCP Chatbot appeared first on Towards Data Science. Mariya Mansurova Go to original source

July 11, 2025
Your Personal Analytics Toolbox

Your Personal Analytics Toolbox Leveraging MCP for automating your daily routine The post Your Personal Analytics Toolbox appeared first on Towards Data Science. Mariya Mansurova Go to original source

July 8, 2025
Fairness Pruning: Precision Surgery to Reduce Bias in LLMs

Fairness Pruning: Precision Surgery to Reduce Bias in LLMs From unjustified shootings to neutral stories: how to fix toxic narratives with selective pruning The post Fairness Pruning: Precision Surgery to Reduce Bias in LLMs appeared first on Towards Data Science. Pere Martra Go to original source

July 4, 2025
A Developer’s Guide to Building Scalable AI: Workflows vs Agents

A Developer’s Guide to Building Scalable AI: Workflows vs Agents A practical guide to choosing between AI agents and workflows for production systems, covering the hidden costs, architectural trade-offs, and decision framework that can save you thousands in deployment mistakes. Includes real-world examples and a scoring system to determine which approach fits your specific use…

June 28, 2025
How to Train a Chatbot Using RAG and Custom Data

How to Train a Chatbot Using RAG and Custom Data Retrieval-Augmented Generation made easy with Llama The post How to Train a Chatbot Using RAG and Custom Data appeared first on Towards Data Science. Haden Pelletier Go to original source

June 26, 2025
Data Has No Moat!

Data Has No Moat! Only if you ignore data quality The post Data Has No Moat! appeared first on Towards Data Science. Fabiana Clemente Go to original source

June 25, 2025
Agentic AI: Implementing Long-Term Memory

Agentic AI: Implementing Long-Term Memory The problem and current solutions The post Agentic AI: Implementing Long-Term Memory appeared first on Towards Data Science. Ida Silfverskiöld Go to original source

June 25, 2025
Why Your Next LLM Might Not Have A Tokenizer

Why Your Next LLM Might Not Have A Tokenizer The Tokenizer Has Been a Necessary Evil, but This Radical Approach Shows That It Might Not Be Necessary Anymore. The post Why Your Next LLM Might Not Have A Tokenizer appeared first on Towards Data Science. Moulik Gupta Go to original source

June 25, 2025
Reinforcement Learning from Human Feedback, Explained Simply

Reinforcement Learning from Human Feedback, Explained Simply The one technique that made ChatGPT so smart The post Reinforcement Learning from Human Feedback, Explained Simply appeared first on Towards Data Science. Vyacheslav Efimov Go to original source

June 24, 2025
Programming, Not Prompting: A Hands-On Guide to DSPy

Programming, Not Prompting: A Hands-On Guide to DSPy A practical deep dive into declarative AI programming The post Programming, Not Prompting: A Hands-On Guide to DSPy appeared first on Towards Data Science. Mariya Mansurova Go to original source

June 24, 2025
Understanding Application Performance with Roofline Modeling

Understanding Application Performance with Roofline Modeling A common challenge with calculating an application’s performance is that the real-world performance and theoretical performance can differ. With an ecosystem of products that is growing with high performance needs such as High Performance Computing (HPC), gaming, or in the current landscape – Large Language Models (LLMs), it is…

June 21, 2025
LLM-as-a-Judge: A Practical Guide

LLM-as-a-Judge: A Practical Guide How to Scale LLM Evaluations Beyond Manual Review The post LLM-as-a-Judge: A Practical Guide appeared first on Towards Data Science. Shuai Guo Go to original source

June 20, 2025
LLaVA on a Budget: Multimodal AI with Limited Resources

LLaVA on a Budget: Multimodal AI with Limited Resources Let’s get started with multimodality The post LLaVA on a Budget: Multimodal AI with Limited Resources appeared first on Towards Data Science. Marcello Politi Go to original source

June 18, 2025
Build an AI Agent to Explore Your Data Catalog with Natural Language

Build an AI Agent to Explore Your Data Catalog with Natural Language Leverage LLMs to query your Databricks Data Catalog The post Build an AI Agent to Explore Your Data Catalog with Natural Language appeared first on Towards Data Science. Fabiana Clemente Go to original source

June 17, 2025
What If I had AI in 2018: Rent the Runway Fulfillment Center Optimization

What If I had AI in 2018: Rent the Runway Fulfillment Center Optimization An LLM in 2018 would not have trivialized a complex project, although it could have enhanced the final solution The post What If I had AI in 2018: Rent the Runway Fulfillment Center Optimization appeared first on Towards Data Science. Hugo Ducruc…

June 14, 2025
How AI Agents “Talk” to Each Other

How AI Agents “Talk” to Each Other Minimize chaos and maintain inter-agent harmony in your projects The post How AI Agents “Talk” to Each Other appeared first on Towards Data Science. TDS Editors Go to original source

June 14, 2025
Agentic AI 103: Building Multi-Agent Teams

Agentic AI 103: Building Multi-Agent Teams Build multi-agent teams that can automate tasks and enhance productivity. The post Agentic AI 103: Building Multi-Agent Teams appeared first on Towards Data Science. Gustavo Santos Go to original source

June 13, 2025
Design Smarter Prompts and Boost Your LLM Output: Real Tricks from an AI Engineer’s Toolbox

Design Smarter Prompts and Boost Your LLM Output: Real Tricks from an AI Engineer’s Toolbox Not just what you ask, but how you ask it. Practical techniques for prompt engineering that deliver The post Design Smarter Prompts and Boost Your LLM Output: Real Tricks from an AI Engineer’s Toolbox appeared first on Towards Data Science. Ugo Pradère…

June 13, 2025
Can AI Truly Develop a Memory That Adapts Like Ours?

Can AI Truly Develop a Memory That Adapts Like Ours? Exploring Titans: A new architecture equipping LLMs with human-inspired memory that learns and updates itself during test-time. The post Can AI Truly Develop a Memory That Adapts Like Ours? appeared first on Towards Data Science. Moulik Gupta Go to original source

June 12, 2025
LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries

LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries Local Large Language Models can convert massive DataFrames to presentable Markdown reports — here’s how. The post LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries appeared first on Towards Data Science. Dario Radečić Go to original source

June 3, 2025
LLM Optimization: LoRA and QLoRA

LLM Optimization: LoRA and QLoRA Scalable fine-tuning techniques for large language models The post LLM Optimization: LoRA and QLoRA appeared first on Towards Data Science. Vyacheslav Efimov Go to original source

May 31, 2025
GAIA: The LLM Agent Benchmark Everyone’s Talking About

GAIA: The LLM Agent Benchmark Everyone’s Talking About What practitioners need to know about this LLM agent benchmark The post GAIA: The LLM Agent Benchmark Everyone’s Talking About appeared first on Towards Data Science. Shuai Guo Go to original source

May 30, 2025
From Data to Stories: Code Agents for KPI Narratives

From Data to Stories: Code Agents for KPI Narratives HuggingFace’s smolagents framework in action The post From Data to Stories: Code Agents for KPI Narratives appeared first on Towards Data Science. Mariya Mansurova Go to original source

May 29, 2025
Tree of Thought Prompting: Teaching LLMs to Think Slowly

Tree of Thought Prompting: Teaching LLMs to Think Slowly Playing Minesweeper with Augmented Reasoning The post Tree of Thought Prompting: Teaching LLMs to Think Slowly appeared first on Towards Data Science. Shuyang Go to original source

May 29, 2025
Code Agents: The Future of Agentic AI

Code Agents: The Future of Agentic AI HuggingFace smolagents framework in action The post Code Agents: The Future of Agentic AI appeared first on Towards Data Science. Mariya Mansurova Go to original source

May 27, 2025
How to Evaluate LLMs and Algorithms — The Right Way

How to Evaluate LLMs and Algorithms — The Right Way Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more. Subscribe today! All the hard work it takes to integrate large language models and powerful algorithms into your workflows can go to waste…

May 24, 2025
Google’s AlphaEvolve: Getting Started with Evolutionary Coding Agents

Google’s AlphaEvolve: Getting Started with Evolutionary Coding Agents Introduction AlphaEvolve [1] is a promising new coding agent by Google’s DeepMind. Let’s look at what it is and why it is generating hype. Much of the Google paper is on the claim that AlphaEvolve is facilitating novel research through its ability to improve code until it solves…

May 23, 2025
Agentic AI 102: Guardrails and Agent Evaluation

Agentic AI 102: Guardrails and Agent Evaluation Introduction In the first post of this series (Agentic AI 101: Starting Your Journey Building AI Agents), we talked about the fundamentals of creating AI Agents and introduced concepts like reasoning, memory, and tools. Of course, that first post touched only the surface of this new area of…

May 17, 2025
Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer

Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer AlphaEvolve imagined as a genetic algorithm coupled to a large language model. Picture created by the author using various tools including Dall-E3 via ChatGPT. Large Language Models have undeniably revolutionized how many of us approach coding, but they’re often more like a super-powered…

May 16, 2025
Empowering LLMs to Think Deeper by Erasing Thoughts

Empowering LLMs to Think Deeper by Erasing Thoughts Introduction Recent large language models (LLMs), such as OpenAI’s o1/o3, DeepSeek’s R1 and Anthropic’s Claude 3.7, demonstrate that allowing the model to think deeper and longer at test time can significantly enhance the model’s reasoning capability. The core approach underlying their deep thinking capability is called…

May 13, 2025
What My GPT Stylist Taught Me About Prompting Better

What My GPT Stylist Taught Me About Prompting Better When I built a GPT-powered fashion assistant, I expected runway looks—not memory loss, hallucinations, or semantic déjà vu. But what unfolded became a lesson in how prompting really works—and why LLMs are more like wild animals than tools. This article builds on my previous article on…

May 10, 2025
Real-Time Interactive Sentiment Analysis in Python

Real-Time Interactive Sentiment Analysis in Python You know what the best part of being an engineer is? You can just build stuff. It’s like a superpower. One rainy afternoon I had this random idea of creating a sentiment visualization of a text input with a smiley face that changes its expression based on how positive…

May 8, 2025
Talking to Kids About AI

Talking to Kids About AI I’ve had the pleasant opportunity recently to be involved with a program called Skype a Scientist, which pairs scientists of various types (biologists, botanists, engineers, computer scientists, etc.) with classrooms of kids to talk about our work and answer their questions. I’m pretty familiar with discussing AI and machine learning with…

May 2, 2025
Agentic AI 101: Starting Your Journey Building AI Agents

Agentic AI 101: Starting Your Journey Building AI Agents Introduction The Artificial Intelligence industry is moving fast. It is impressive and many times overwhelming. I have been studying, learning, and building my foundations in this area of Data Science because I believe that the future of Data Science is strongly correlated with the development of…

May 2, 2025
LLM Evaluations: from Prototype to Production

LLM Evaluations: from Prototype to Production Evaluation is the cornerstone of any machine learning product. Investing in quality measurement delivers significant returns. Let’s explore the potential business benefits. As management consultant and writer Peter Drucker once said, “If you can’t measure it, you can’t improve it.” Building a robust evaluation system helps you identify areas…

April 26, 2025
An LLM-Based Workflow for Automated Tabular Data Validation

An LLM-Based Workflow for Automated Tabular Data Validation This article is part of a series of articles on automating data cleaning for any tabular dataset: Effortless Spreadsheet Normalisation With LLM You can test the feature described in this article on your own dataset using the CleanMyExcel.io service, which is free and requires no registration. What…

April 15, 2025
The Invisible Revolution: How Vectors Are (Re)defining Business Success

The Invisible Revolution: How Vectors Are (Re)defining Business Success In a world that focuses more on data, business leaders must understand vector thinking. At first, vectors may appear as complicated as algebra was in school, but they serve as a fundamental building block. Vectors are as essential as algebra for tasks like sharing a bill…

April 11, 2025
Circuit Tracing: A Step Closer to Understanding Large Language Models

Circuit Tracing: A Step Closer to Understanding Large Language Models Context Over the years, Transformer-based large language models (LLMs) have made substantial progress across a wide range of tasks, evolving from simple information retrieval systems to sophisticated agents capable of coding, writing, conducting research, and much more. But despite their capabilities, these models are still largely…

April 9, 2025
AI in Social Research and Polling

AI in Social Research and Polling This month, I’m going to be discussing a really interesting topic that I came across in a recent draft paper by a professor at the University of Maryland named M. R. Sauter. In the paper, they discuss (among other things) the phenomenon of social scientists and pollsters trying to employ…

April 2, 2025
Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs

Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs Creating efficient prompts for large language models often starts as a simple task… but it doesn’t always stay that way. Initially, following basic best practices seems sufficient: adopt the persona of a specialist, write clear instructions, require a specific response format, and…

March 15, 2025
Are You Still Using LoRA to Fine-Tune Your LLM?

Are You Still Using LoRA to Fine-Tune Your LLM? LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS … And most are based…

March 14, 2025
LLM + RAG: Creating an AI-Powered File Reader Assistant

LLM + RAG: Creating an AI-Powered File Reader Assistant Introduction AI is everywhere. It is hard not to interact at least once a day with a Large Language Model (LLM). The chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails…well, they do a lot.…

March 4, 2025
LLaDA: The Diffusion Model That Could Redefine Language Generation

LLaDA: The Diffusion Model That Could Redefine Language Generation Introduction What if we could make language models think more like humans? Instead of writing one word at a time, what if they could sketch out their thoughts first, and gradually refine them? This is exactly what Large Language Diffusion Models (LLaDA) introduces: a different approach to…

February 27, 2025
AI Agents from Zero to Hero – Part 1

AI Agents from Zero to Hero – Part 1 Intro AI Agents are autonomous programs that perform tasks, make decisions, and communicate with others. Normally, they use a set of tools to help complete tasks. In GenAI applications, these Agents process sequential reasoning and can use external tools (like web searches or database queries) when…

February 21, 2025
Tutorial: Semantic Clustering of User Messages with LLM Prompts

Tutorial: Semantic Clustering of User Messages with LLM Prompts As a Developer Advocate, it’s challenging to keep up with user forum messages and understand the big picture of what users are saying. There’s plenty of valuable content — but how can you quickly spot the key conversations? In this tutorial, I’ll show you an AI…

February 18, 2025
How to Measure the Reliability of a Large Language Model’s Response

How to Measure the Reliability of a Large Language Model’s Response The basic principle of Large Language Models (LLMs) is very simple: to predict the next word (or token) in a sequence of words based on statistical patterns in their training data. However, this seemingly simple capability turns out to be incredibly sophisticated when it…

February 13, 2025
I Tried Making my Own (Bad) LLM Benchmark to Cheat in Escape Rooms

I Tried Making my Own (Bad) LLM Benchmark to Cheat in Escape Rooms Recently, DeepSeek announced their latest model, R1, and article after article came out praising its performance relative to cost, and how the release of such open-source models could genuinely change the course of LLMs forever. That is really exciting! And also, too…

February 8, 2025
Training Large Language Models: From TRPO to GRPO

Training Large Language Models: From TRPO to GRPO DeepSeek has recently made quite a buzz in the AI community, thanks to its impressive performance at relatively low costs. I think this is a perfect opportunity to dive deeper into how Large Language Models (LLMs) are trained. In this article, we will focus on the Reinforcement Learning…

February 6, 2025
Supercharge Your RAG with Multi-Agent Self-RAG

Supercharge Your RAG with Multi-Agent Self-RAG Introduction Many of us might have tried to build a RAG application and noticed it falls significantly short of addressing real-life needs. Why is that? It’s because many real-world problems require multiple steps of information retrieval and reasoning. We need our agent to perform those as humans normally do,…

February 6, 2025
From Resume to Cover Letter Using AI and LLM, with Python and Streamlit

From Resume to Cover Letter Using AI and LLM, with Python and Streamlit DISCLAIMER: The idea of writing a cover letter or even a resume with AI obviously does not start with me. A lot of people have done this before (very successfully) and have built websites and even companies from the idea. This is just a…

February 5, 2025
Improving Agent Systems & AI Reasoning

Improving Agent Systems & AI Reasoning DeepSeek-R1, OpenAI o1 & o3, Test-Time Compute Scaling, Model Post-Training and the Transition to Reasoning Language Models (RLMs) Image by author and GPT-4o meant to represent DeepSeek and other competitive GenAI model providers Introduction Over the past year generative AI adoption and AI Agent development have skyrocketed. Reports from LangChain…

February 3, 2025
Sparse AutoEncoder: from Superposition to interpretable features

Sparse AutoEncoder: from Superposition to interpretable features Disentangle features in complex neural networks with superposition Complex neural networks, such as Large Language Models (LLMs), suffer quite often from interpretability challenges. One of the most important reasons for such difficulty is superposition, a phenomenon of the neural network having fewer dimensions than the number of features it…

February 2, 2025
How to Implement Guardrails for Your AI Agents with CrewAI

How to Implement Guardrails for Your AI Agents with CrewAI LLM Agents are non-deterministic by nature: implement proper guardrails for your AI Application. Continue reading on Towards Data Science » Alessandro Romano Go to original source

January 28, 2025
Understanding Emergent Capabilities in LLMs: Lessons from Biological Systems

Understanding Emergent Capabilities in LLMs: Lessons from Biological Systems How natural systems’ fundamental laws help explain AI’s unexpected abilities Continue reading on Towards Data Science » Javier Marin Go to original source

January 25, 2025
On a Time Crunch but Still Want to Learn to Develop Multi-Agent AI?

On a Time Crunch but Still Want to Learn to Develop Multi-Agent AI? These 3 starter projects only take a weekend (and a few cups of coffee, maybe) Continue reading on Towards Data Science » Thuwarakesh Murallie Go to original source

January 24, 2025
How to Evaluate LLM Summarization

How to Evaluate LLM Summarization A practical and effective guide for evaluating AI summaries Image from Unsplash Summarization is one of the most practical and convenient tasks enabled by LLMs. However, compared to other LLM tasks like question-asking or classification, evaluating LLMs on summarization is far more challenging. And so I myself have neglected evals for…

January 23, 2025
Why Generative-AI Apps’ Quality Often Sucks and What to Do About It

Why Generative-AI Apps’ Quality Often Sucks and What to Do About It How to get from PoCs to tested high-quality applications in production Image licensed from elements.envato.com, edit by Marcel Müller, 2025 The generative AI hype has rolled through the business world in the past two years. This technology can make business process executions more efficient,…

January 21, 2025
How to Use Pre-Trained Language Models for Regression

How to Use Pre-Trained Language Models for Regression Why and how to convert mT5 into a regression metric for numerical prediction Continue reading on Towards Data Science » Aden Haussmann Go to original source

January 19, 2025
What Would a Stoic Do? — An AI-Based Decision-Making Model

What Would a Stoic Do? — An AI-Based Decision-Making Model Using AI to build Marcus Aurelius’ reincarnation Continue reading on Towards Data Science » Pol Marin Go to original source

January 13, 2025
Linearizing Llama

Linearizing Llama Speeding up Llama: A hybrid approach to attention mechanisms Source: Image by Author (Generated using Gemini 1.5 Flash) In this article, we will see how to replace softmax self-attention in Llama-3.2-1B with hybrid attention combining softmax sliding window and linear attention. This implementation will help us better understand the growing interest in linear attention…

January 11, 2025
Building Autonomous Multi-Tool Agents with Gemini 2.0 and LangGraph

Building Autonomous Multi-Tool Agents with Gemini 2.0 and LangGraph A practical tutorial with full code examples for building and running multi-tool agents Continue reading on Towards Data Science » Youness Mansar Go to original source

January 10, 2025
Understanding the Evolution of ChatGPT: Part 1—An In-Depth Look at GPT-1 and What Inspired It

Understanding the Evolution of ChatGPT: Part 1—An In-Depth Look at GPT-1 and What Inspired It Tracing the roots of ChatGPT: GPT-1, the foundation of OpenAI’s LLMs (Image from Unsplash) The GPT (Generative Pre-Training) model family, first introduced by OpenAI in 2018, is another important application of the Transformer architecture. It has since evolved through versions like…

January 8, 2025
AI Agents Hype, Explained — What You Really Need to Know to Get Started

AI Agents Hype, Explained — What You Really Need to Know to Get Started I’ll set the record straight — AI Agents are not new but advanced. Learn how they’ve evolved and where to get started. Continue reading on Towards Data Science » Marc Nehme Go to original source

January 7, 2025
The Next Frontier in LLM Accuracy

The Next Frontier in LLM Accuracy Exploring the Power of Lamini Memory Tuning Image generated by DALL-E 3 Accuracy is often critical for LLM applications, especially in cases such as API calling or summarisation of financial reports. Fortunately, there are ways to enhance precision. The best practices to improve accuracy include the following steps: You can start…

January 5, 2025
Multi-Agentic RAG with Hugging Face Code Agents

Multi-Agentic RAG with Hugging Face Code Agents Using Qwen2.5-7B-Instruct powered code agents to create a local, open source, multi-agentic RAG system Photo by Jaredd Craig on Unsplash Large Language Models have shown impressive capabilities and they are still undergoing steady improvements with each new generation of models released. Applications such as chatbots and summarisation can directly exploit…

January 1, 2025
Building Trust in LLM Answers: Highlighting Source Texts in PDFs

Building Trust in LLM Answers: Highlighting Source Texts in PDFs 100% accuracy isn’t everything: helping users navigate the document is the real value Continue reading on Towards Data Science » Angela & Kezhan Shi Go to original source

December 28, 2024
Linearizing Attention

Linearizing Attention Breaking the quadratic barrier: modern alternatives to softmax attention Large Language Models are great, but they have a slight drawback: they use softmax attention, which can be computationally intensive. In this article we will explore whether we can replace the softmax to achieve linear time complexity. Image…

December 27, 2024