{"id":644,"date":"2024-12-18T07:00:29","date_gmt":"2024-12-18T07:00:29","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2024\/12\/18\/2024-in-review-what-i-got-right-where-i-was-wrong-and-bolder-predictions-for-2025-4092c2d726cd\/"},"modified":"2024-12-18T07:00:29","modified_gmt":"2024-12-18T07:00:29","slug":"2024-in-review-what-i-got-right-where-i-was-wrong-and-bolder-predictions-for-2025-4092c2d726cd","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2024\/12\/18\/2024-in-review-what-i-got-right-where-i-was-wrong-and-bolder-predictions-for-2025-4092c2d726cd\/","title":{"rendered":"2024 in Review: What I Got Right, Where I Was Wrong, and Bolder Predictions for 2025"},"content":{"rendered":"<div>\n<h4>What I got right (and wrong) about trends in 2024 and daring to make bolder predictions for the year\u00a0ahead<\/h4>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2A2ORoApoTLQC29I99kwmT2A.png?ssl=1\"><figcaption>AI Buzzword and Trend Bingo (Image by the\u00a0author)<\/figcaption><\/figure>\n<p>In 2023, building AI-powered applications felt full of promise, but the challenges were already starting to show. By 2024, we began experimenting with techniques to tackle the hard realities of making them work in production.<\/p>\n<p>Last year, I <a href=\"https:\/\/towardsdatascience.com\/2023-in-review-recapping-the-post-chatgpt-era-and-what-to-expect-for-2024-bb4357a4e827?sk=0a494f87bd0b0344594fa1e6694773a6\">reviewed the biggest trends in AI in 2023 and made predictions for 2024<\/a>. This year, instead of a timeline, I want to focus on key themes: What trends emerged? Where did I get it wrong? 
And what can we expect for\u00a02025?<\/p>\n<h3>2024 in\u00a0Review<\/h3>\n<p>If I had to summarize the AI space in 2024, it would be the \u201cCaptain, it\u2019s Wednesday\u201d meme. The number of major releases this year was overwhelming. I don\u2019t blame anyone in this space who\u2019s feeling exhausted towards the end of this year. It\u2019s been a crazy ride, and it\u2019s been hard to keep up. Let\u2019s review key themes in the AI space and see if I correctly predicted them last\u00a0year.<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/cdn.embedly.com\/widgets\/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A\/\/x.com\/whataweekhuh\/status\/1866908509100782044&amp;image=\" width=\"500\" height=\"281\" frameborder=\"0\" scrolling=\"no\"><a href=\"https:\/\/medium.com\/media\/c0e3c3bf8130f6cfc80109837ab4997f\/href\">https:\/\/medium.com\/media\/c0e3c3bf8130f6cfc80109837ab4997f\/href<\/a><\/iframe><\/p>\n<h4>Evaluations<\/h4>\n<p>Let\u2019s start by looking at some generative AI solutions that made it to production. There aren\u2019t many. As a <a href=\"https:\/\/a16z.com\/generative-ai-enterprise-2024\/\">survey by a16z<\/a> revealed in 2024, companies are still hesitant to deploy generative AI in customer-facing applications. Instead, they feel more confident using it for internal tasks, like document search or chatbots.<\/p>\n<p>So, why aren\u2019t there that many customer-facing generative AI applications in the wild? Probably because we are still figuring out how to evaluate them properly. This was one of my predictions for\u00a02024.<\/p>\n<p>Much of the research involved using another LLM to evaluate the output of an LLM (<a href=\"https:\/\/arxiv.org\/abs\/2411.15594\">LLM-as-a-judge<\/a>). While the approach may be clever, it\u2019s also imperfect: it adds cost, introduces bias, and can be unreliable.<\/p>\n<p>Looking back, I anticipated we would see this issue solved this year. 
However, looking at the landscape today, we still haven\u2019t found a reliable way to evaluate generative AI solutions, even though evaluation was a major topic of discussion. I think LLM-as-a-judge is currently the only way to evaluate generative AI solutions at scale, which shows how early we are in this\u00a0field.<\/p>\n<h4>Multimodality<\/h4>\n<p>Although this one might have been obvious to many of you, I didn\u2019t have this on my radar for 2024. With the releases of <a href=\"https:\/\/openai.com\/index\/gpt-4-research\/\">GPT-4<\/a>, <a href=\"https:\/\/ai.meta.com\/blog\/llama-3-2-connect-2024-vision-edge-mobile-devices\/\">Llama 3.2<\/a>, and <a href=\"https:\/\/arxiv.org\/abs\/2407.01449\">ColPali<\/a>, multimodal foundation models were a big trend in 2024. While we developers were busy figuring out how to make LLMs work in our existing pipelines, researchers were one step ahead. They were already building foundation models that could handle more than one modality.<\/p>\n<blockquote><p>\u201cThere is <strong>absolutely no way in hell<\/strong> we will ever reach human-level AI without getting machines to learn from high-bandwidth sensory inputs, such as vision.\u201d\u200a\u2014\u200a<a href=\"https:\/\/www.linkedin.com\/posts\/yann-lecun_parm-prmshra-on-x-activity-7172266619103080448-iqvP\/?utm_source=share&amp;utm_medium=member_desktop\">Yann\u00a0LeCun<\/a>\n<\/p><\/blockquote>\n<p>Take PDF parsing as an example of multimodal models\u2019 usefulness beyond text-to-image tasks. <a href=\"https:\/\/arxiv.org\/abs\/2407.01449\">ColPali<\/a>\u2019s researchers avoided the difficult steps of OCR and layout extraction by using vision language models (VLMs). Systems like ColPali and <a href=\"https:\/\/huggingface.co\/vidore\/colqwen2-v0.1\">ColQwen2<\/a> process PDFs as images, extracting information directly without pre-processing or chunking. 
This is a reminder that simpler solutions often come from changing how you frame the\u00a0problem.<\/p>\n<p>Multimodal models are a bigger shift than they might seem. Document search across PDFs is just the beginning. Multimodality in foundation models will unlock entirely new possibilities for applications across industries. With more modalities, AI is no longer just about language\u200a\u2014\u200ait\u2019s about understanding the\u00a0world.<\/p>\n<h4>Fine-tuning open-weight models and quantization<\/h4>\n<p>Open-weight models are closing the performance gap to closed models. Fine-tuning them gives you a performance boost while still being lightweight. Quantization makes these models smaller and more efficient (see also <a href=\"https:\/\/medium.com\/towards-data-science\/towards-green-ai-how-to-make-deep-learning-models-more-efficient-in-production-3b1e7430a14\">Green AI<\/a>) to run anywhere, even on small devices. Quantization pairs well with fine-tuning, especially since fine-tuning language models is inherently challenging (see\u00a0<a href=\"https:\/\/arxiv.org\/abs\/2305.14314\">QLoRA<\/a>).<\/p>\n<p>Together, these trends make it clear that the future isn\u2019t just bigger models\u200a\u2014\u200ait\u2019s smarter\u00a0ones.<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/cdn.embedly.com\/widgets\/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A\/\/x.com\/maximelabonne\/status\/1816416043511808259\/&amp;image=\" width=\"500\" height=\"281\" frameborder=\"0\" scrolling=\"no\"><a href=\"https:\/\/medium.com\/media\/8ec8db1e6ce5401145753360f0d2f64d\/href\">https:\/\/medium.com\/media\/8ec8db1e6ce5401145753360f0d2f64d\/href<\/a><\/iframe><\/p>\n<p>I don\u2019t think I explicitly mentioned this one and only <a href=\"https:\/\/medium.com\/towards-data-science\/shifting-tides-the-competitive-edge-of-open-source-llms-over-closed-source-llms-aee76018b5c7\">wrote a small piece on this in the second 
quarter of 2024<\/a>. So, I will not give myself a point\u00a0here.<\/p>\n<h4>AI agents<\/h4>\n<p>This year, AI agents and agentic workflows gained a lot of attention, as Andrew Ng predicted at the beginning of the year. We saw <a href=\"https:\/\/www.langchain.com\/langgraph\">LangChain<\/a> and <a href=\"https:\/\/docs.llamaindex.ai\/en\/stable\/use_cases\/agents\/\">LlamaIndex<\/a> move into incorporating agents, <a href=\"https:\/\/www.crewai.com\/\">CrewAI<\/a> gain a lot of momentum, and OpenAI come out with <a href=\"https:\/\/github.com\/openai\/swarm\">Swarm<\/a>. This is another topic I hadn\u2019t seen coming because I hadn\u2019t looked into\u00a0it.<\/p>\n<blockquote><p>\u201cI think AI agentic workflows will drive massive AI progress this year\u200a\u2014\u200aperhaps even more than the next generation of foundation models.\u201d\u200a\u2014\u200a<a href=\"https:\/\/x.com\/AndrewYNg\/status\/1770897666702233815?lang=en\">Andrew\u00a0Ng<\/a>\n<\/p><\/blockquote>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2A2tuAvzjy-b6EjLUtRB2GMQ.png?ssl=1\"><figcaption><a href=\"https:\/\/trends.google.com\/trends\/explore?date=2024-01-01%202024-12-16&amp;q=AI%20agents&amp;hl=en-US\">Screenshot from Google Trends for the term \u201cAI agents\u201d in\u00a02024.<\/a><\/figcaption><\/figure>\n<p>Despite the massive interest in AI agents, they remain controversial. First, there is still no clear definition of \u201cAI agent\u201d and its capabilities. Are AI agents just LLMs with access to tools, or do they have other specific capabilities? Second, they come with added latency and cost. I have read many comments saying that agent systems aren\u2019t suitable for production because of\u00a0this.<\/p>\n<p>But I think we have already been seeing some agentic pipelines in production with lightweight workflows, such as routing user queries to specific function calls. 
I think we will continue to see agents in 2025. Hopefully, we will also get a clearer definition and\u00a0picture.<\/p>\n<h4>RAG isn\u2019t de*d and retrieval goes mainstream<\/h4>\n<p><a href=\"https:\/\/medium.com\/@iamleonie\/building-retrieval-augmented-generation-systems-be587f42aedb\">Retrieval-Augmented Generation (RAG)<\/a> gained significant attention in 2023 and remained a key topic in 2024, with many new variants emerging. However, it remains a topic of debate. Some argue it\u2019s becoming obsolete with long-context models, while others question whether it\u2019s even a new idea. While I think the criticism of the terminology is justified, I think the concept is here to stay (for a little while at\u00a0least).<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/cdn.embedly.com\/widgets\/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A\/\/x.com\/weaviate_io\/status\/1866528335884325070&amp;image=\" width=\"500\" height=\"281\" frameborder=\"0\" scrolling=\"no\"><a href=\"https:\/\/medium.com\/media\/777a461a8ac553c94480911a25237210\/href\">https:\/\/medium.com\/media\/777a461a8ac553c94480911a25237210\/href<\/a><\/iframe><\/p>\n<p>Every time a new long-context model is released, some people predict it will be the end of RAG pipelines. I don\u2019t think that\u2019s going to happen. This whole discussion should be a blog post of its own, so I won\u2019t go into depth here and will save it for another post. Let me just say that I don\u2019t think it\u2019s one or the other. The two complement each other: we will probably be using long-context models together with RAG pipelines.<\/p>\n<p>Also, having a database in applications is not a new concept. The term \u2018RAG,\u2019 which refers to retrieving information from a knowledge source to enhance an LLM\u2019s output, has faced criticism. Some argue it\u2019s merely a rebranding of techniques long used in other fields, such as software engineering. 
While I think we will probably part with the term in the long run, the technique is here to\u00a0stay.<\/p>\n<p>Despite predictions of RAG\u2019s demise, retrieval remains a cornerstone of AI pipelines. While I may be biased by my work in retrieval, it felt like this topic became more mainstream in AI this year. It started with many discussions around keyword search (BM25) as a baseline for RAG pipelines. It then evolved into a larger discussion around dense retrieval models, such as <a href=\"https:\/\/arxiv.org\/abs\/2004.12832\">ColBERT<\/a> or\u00a0ColPali.<\/p>\n<h4>Knowledge graphs<\/h4>\n<p>I completely missed this topic because I\u2019m not too familiar with it. Knowledge graphs in RAG systems (e.g., Graph RAG) were another big topic. Since all I can say about knowledge graphs at this moment is that they seem to be a powerful external knowledge source, I will keep this section\u00a0short.<\/p>\n<p>The key topics of 2024 suggest that we are now realizing the limitations of building applications with foundation models. The hype around ChatGPT may have settled, but the drive to integrate foundation models into applications is still very much alive. It\u2019s just way more difficult than we had anticipated.<\/p>\n<blockquote><p>\u201cThe race to make AI more efficient and more useful, before investors lose their enthusiasm, is on.\u201d\u200a\u2014\u200a<a href=\"https:\/\/www.economist.com\/the-world-ahead\/2024\/11\/18\/will-the-bubble-burst-for-ai-in-2025-or-will-it-start-to-deliver\">The Economist<\/a>\n<\/p><\/blockquote>\n<p>2024 taught us that scaling foundation models isn\u2019t enough. We need better evaluation, smarter retrieval, and more efficient workflows to make AI useful. The limitations we ran into this year aren\u2019t signs of stagnation\u200a\u2014\u200athey\u2019re clues about what we need to fix next in\u00a02025.<\/p>\n<h3>My 2025 Predictions<\/h3>\n<p>OK, now for the interesting part: what are my 2025 predictions? 
This year, I want to make some bolder predictions for next year to make it a little more\u00a0fun:<\/p>\n<ul>\n<li>\n<strong>Video will be an important modality: <\/strong>After text-only LLMs evolved into multimodal foundation models (mostly text and images), it\u2019s only natural that video will be the next modality. I can imagine seeing more video-capable foundation models follow in <a href=\"https:\/\/openai.com\/sora\/\">Sora<\/a>\u2019s footsteps.<\/li>\n<li>\n<strong>From one-shot to agentic to human-in-the-loop: <\/strong>I imagine we will start incorporating humans into AI-powered systems. While we started with one-shot systems, we are now at the stage of having AI agents coordinate different tasks to improve results. But AI agents won\u2019t replace humans. They\u2019ll empower them. Systems that incorporate human feedback will deliver better outcomes across industries. In the long term, I imagine that we will need systems that wait for human feedback before taking action on the next\u00a0task.<\/li>\n<li>\n<strong>Fusion of AI and crypto:<\/strong> Admittedly, I don\u2019t know much about the entire crypto scene, but I saw <a href=\"https:\/\/x.com\/brian_armstrong\/status\/1824547713012080806\">this Tweet by Brian Armstrong about how AI agents should be equipped with crypto wallets<\/a>. Also, concepts like <a href=\"https:\/\/tokenomicsexplained.com\/depin\/\">DePin (Decentralized Physical Infrastructure)<\/a> could be interesting to explore for model training and inference. While this sounds like buzzword bingo, I\u2019m curious to see whether early experiments show this to be hype or\u00a0reality.<\/li>\n<li>\n<strong>Latency and cost per token will drop:<\/strong> Currently, one big issue for AI agents is added latency and cost. 
However, with Moore\u2019s law and research into making AI models more efficient, like quantization and efficient training techniques (not only for cost reasons but also for environmental reasons), I can imagine both the latency and cost per token going\u00a0down.<\/li>\n<\/ul>\n<p><strong>I am curious to hear your predictions for the AI space in\u00a02025!<\/strong><\/p>\n<p>PS: Funnily enough, I was researching recipes for Christmas cookies with ChatGPT a few days ago instead of using Google, which I was wondering about two years ago when ChatGPT was released.<\/p>\n<p><a href=\"https:\/\/medium.com\/geekculture\/will-we-be-using-chatgpt-instead-of-google-to-get-a-christmas-cookie-recipe-next-year-45360d4a1178\">Will We Be Using ChatGPT Instead of Google To Get a Christmas Cookie Recipe Next Year?<\/a><\/p>\n<h3>Enjoyed This\u00a0Story?<\/h3>\n<p><a href=\"https:\/\/medium.com\/subscribe\/@iamleonie\"><em>Subscribe for free<\/em><\/a><em> to get notified when I publish a new\u00a0story.<\/em><\/p>\n<p><a href=\"https:\/\/medium.com\/@iamleonie\/subscribe\">Get an email whenever Leonie Monigatti publishes.<\/a><\/p>\n<p><em>Find me on <\/em><a href=\"https:\/\/www.linkedin.com\/in\/804250ab\/\"><em>LinkedIn<\/em><\/a>,<em> <\/em><a href=\"https:\/\/twitter.com\/helloiamleonie\"><em>Twitter<\/em><\/a><em>, and\u00a0<\/em><a href=\"https:\/\/www.kaggle.com\/iamleonie\"><em>Kaggle<\/em><\/a><em>!<\/em><\/p>\n<hr>\n<p><a href=\"https:\/\/towardsdatascience.com\/2024-in-review-what-i-got-right-where-i-was-wrong-and-bolder-predictions-for-2025-4092c2d726cd\">2024 in Review: What I Got Right, Where I Was Wrong, and Bolder Predictions for 2025<\/a> was originally published in <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a> on Medium, where people are 
continuing the conversation by highlighting and responding to this story.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Leonie Monigatti<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/medium.com\/m\/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2F2024-in-review-what-i-got-right-where-i-was-wrong-and-bolder-predictions-for-2025-4092c2d726cd\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>2024 in Review: What I Got Right, Where I Was Wrong, and Bolder Predictions for 2025 What I got right (and wrong) about trends in 2024 and daring to make bolder predictions for the year\u00a0ahead AI Buzzword and Trend Bingo (Image by the\u00a0author) In 2023, building AI-powered applications felt full of promise, but the challenges [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,69,83,70,249,280],"tags":[98,752],"class_list":["post-644","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-artificial-intelligence","category-data-science","category-machine-learning","category-notes-from-industry","category-technology","tag-ai","tag-year"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/644"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=644"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/644\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.c
om\/index.php\/wp-json\/wp\/v2\/media?parent=644"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=644"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=644"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}