Category: deepseek

How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals The recent launch of the DeepSeek-R1 model sent ripples across the global AI community. It delivered breakthroughs on par with the reasoning models from Meta and OpenAI, achieving this in a fraction of the time and at a significantly lower cost. Beyond…

April 24, 2025
How LLMs Work: Reinforcement Learning, RLHF, DeepSeek R1, OpenAI o1, AlphaGo

How LLMs Work: Reinforcement Learning, RLHF, DeepSeek R1, OpenAI o1, AlphaGo Welcome to part 2 of my LLM deep dive. If you’ve not read Part 1, I highly encourage you to check it out first. Previously, we covered the first two major stages of training an LLM: Pre-training — Learning from massive datasets to form a base…

February 28, 2025
Training Large Language Models: From TRPO to GRPO

Training Large Language Models: From TRPO to GRPO Deepseek has recently made quite a buzz in the AI community, thanks to its impressive performance at relatively low costs. I think this is a perfect opportunity to dive deeper into how Large Language Models (LLMs) are trained. In this article, we will focus on the Reinforcement Learning…

February 6, 2025
DeepSeek V3: A New Contender in AI-Powered Data Science

DeepSeek V3: A New Contender in AI-Powered Data Science How DeepSeek’s budget-friendly AI model stacks up against ChatGPT, Claude, and Gemini in SQL, EDA, and machine learning Continue reading on Towards Data Science » Yu Dong Go to original source

February 2, 2025