{"id":306,"date":"2024-12-02T07:06:05","date_gmt":"2024-12-02T07:06:05","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2024\/12\/02\/smaller-is-smarter-89a9b3a5ad9e\/"},"modified":"2024-12-02T07:06:05","modified_gmt":"2024-12-02T07:06:05","slug":"smaller-is-smarter-89a9b3a5ad9e","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2024\/12\/02\/smaller-is-smarter-89a9b3a5ad9e\/","title":{"rendered":"Smaller is smarter"},"content":{"rendered":"<p>Smaller is smarter<\/p>\n<div>\n<p>Concerns about the environmental impacts of Large Language Models (LLMs) are growing. Although detailed information about the actual costs of LLMs can be difficult to find, let\u2019s attempt to gather some facts to understand the\u00a0scale.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2A9tyzhcJGdDxLBo_-dsoAqQ.png?ssl=1\"><figcaption>Generated with ChatGPT-4o<\/figcaption><\/figure>\n<p>Since comprehensive data on GPT-4 is not readily available, we can consider Llama 3.1 405B as an example. This open-source model from Meta is arguably the most \u201ctransparent\u201d LLM to date. Based on various <a href=\"https:\/\/ai.meta.com\/blog\/meta-llama-3-1\/\">benchmarks<\/a>, Llama 3.1 405B is comparable to GPT-4, providing a reasonable basis for understanding LLMs within this\u00a0range.<\/p>\n<h3>Inference<\/h3>\n<p>Running the 32-bit version of this model requires between 1,620 and 1,944 GB of GPU memory, depending on the source (<a href=\"https:\/\/www.substratus.ai\/blog\/llama-3-1-405b-gpu-requirements\">substratus<\/a>, <a href=\"https:\/\/huggingface.co\/blog\/llama31\">HuggingFace<\/a>). For a conservative estimate, let\u2019s use the lower figure of 1,620 GB, which is simply 405 billion parameters at 4 bytes each. 
To put this into perspective, and acknowledging that this is a simplified analogy, 1,620 GB of GPU memory is roughly equivalent to the combined memory of 100 standard MacBook Pros (16 GB each). So, when you ask one of these LLMs for a tiramisu recipe in Shakespearean style, it takes the combined memory of about 100 MacBook Pros to give you an\u00a0answer.<\/p>\n<h3>Training<\/h3>\n<p>The figures above cover inference only. They do not include the <a href=\"https:\/\/www.techtarget.com\/searchenterpriseai\/news\/366596503\/Meta-intros-its-biggest-open-source-AI-model-Llama-31-405B#:~:text=Meta%20said%20that%20to%20train,to%20train%20the%20new%20model.\">training costs<\/a>, which are estimated to involve around 16,000 GPUs running for around 80 days, at an approximate cost of $60 million USD (excluding hardware costs), a significant investment from Meta. In terms of electricity consumption, <a href=\"https:\/\/www.notebookcheck.net\/Meta-unveils-biggest-smartest-royalty-free-Llama-3-1-405B-AI.866775.0.html\">training required 11\u00a0GWh<\/a>.<\/p>\n<p>The <a href=\"https:\/\/www.data.gouv.fr\/fr\/reuses\/consommation-par-habitant-et-par-ville-delectricite-en-france\/\">annual electricity consumption per person<\/a> in a country like France is approximately 2,300 kWh. Thus, 11 GWh corresponds to the yearly electricity usage of about 4,800 people (11,000,000 kWh divided by 2,300 kWh). This consumption corresponds to approximately 5,000 tons of CO\u2082-equivalent greenhouse gases (<a href=\"https:\/\/www.econologie.com\/europe-emissions-co2-pays-kwh-electrique\/\">based on the European average<\/a>), although this figure can easily double depending on the country where the model was\u00a0trained.<\/p>\n<p>For comparison, burning 1 liter of diesel produces 2.54 kg of CO\u2082. 
Therefore, at the European average carbon intensity, training Llama 3.1 405B is roughly equivalent to the emissions from burning around 2 million liters of diesel (5,000,000 kg divided by 2.54 kg per liter). This translates to approximately 28 million kilometers of car travel, which implies a car consuming about 7 liters per 100 km. I think that provides enough perspective\u2026 and I haven\u2019t even mentioned the water required to cool the\u00a0GPUs!<\/p>\n<h3>Sustainability<\/h3>\n<p>Clearly, AI is still in its infancy, and we can expect more efficient and sustainable solutions to emerge over time. However, in this intense race, OpenAI\u2019s finances show a significant gap between its revenue and its operating expenses, particularly its inference costs. In 2024, the company is projected to spend approximately $4 billion on processing power provided by Microsoft for inference workloads, while its annual revenue is estimated to range between $3.5 billion and $4.5 billion. This means that inference costs alone nearly match, or even exceed, OpenAI\u2019s total revenue (<a href=\"https:\/\/www.deeplearning.ai\/the-batch\/openai-faces-financial-growing-pains-spending-double-its-revenue\/\">deeplearning.ai<\/a>).<\/p>\n<p>All of this is happening while experts are reporting a performance plateau for AI models (the limits of the scaling paradigm). Increasing model size and GPU count is yielding significantly diminished returns compared to previous leaps, such as the advancements GPT-4 achieved over GPT-3. 
\u201cThe pursuit of AGI has always been unrealistic, and the \u2018bigger is better\u2019 approach to AI was bound to hit a limit eventually, and I think this is what we\u2019re seeing here,\u201d said <a href=\"https:\/\/www.france24.com\/en\/live-news\/20241118-is-ai-s-meteoric-rise-beginning-to-slow\">Sasha Luccioni<\/a>, researcher and AI lead at startup Hugging\u00a0Face.<\/p>\n<h3>And now?<\/h3>\n<p>But don\u2019t get me wrong: I\u2019m not putting AI on trial, because I love it! This research phase is a perfectly normal stage in the development of AI. However, I believe we need to exercise common sense in how we use AI: we can\u2019t use a bazooka to kill a mosquito every time. AI must be made sustainable, not only to protect our environment but also to address social divides. Indeed, leaving the Global South behind in the AI race because of high costs and resource demands would be a significant failure of this new intelligence revolution.<\/p>\n<p>So, do you really need the full power of ChatGPT to handle the simplest tasks in your RAG pipeline? Are you looking to control your operational costs? Do you want complete end-to-end control over your pipeline? Are you concerned about your private data circulating on the web? Or perhaps you\u2019re simply mindful of AI\u2019s impact and committed to its conscious use?<\/p>\n<h3>SLMs can be a smarter\u00a0choice!<\/h3>\n<p>Small language models (SLMs) offer an excellent alternative worth exploring. They can run on your local infrastructure and, when combined with human intelligence, deliver substantial value. There is no universally agreed definition of an SLM (in 2019, for instance, GPT-2 with its 1.5 billion parameters was considered an LLM, which is no longer the case), but I am referring to models such as Mistral 7B, Llama-3.2 3B, or Phi-3.5, to name a few. 
These models can run on a \u201cgood\u201d computer, resulting in a much smaller carbon footprint while keeping your data confidential when installed on-premises. Although they are less versatile, when used wisely for specific tasks they can still provide significant value while being more environmentally responsible.<\/p>\n<hr>\n<p><a href=\"https:\/\/towardsdatascience.com\/smaller-is-smarter-89a9b3a5ad9e\">Smaller is smarter<\/a> was originally published in <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a> on Medium.<\/p>\n<\/div>\n<p>Alexandre Allouin<br \/>\n<a href=\"https:\/\/medium.com\/m\/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2Fsmaller-is-smarter-89a9b3a5ad9e\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Smaller is smarter Concerns about the environmental impacts of Large Language Models (LLMs) are growing. Although detailed information about the actual costs of LLMs can be difficult to find, let\u2019s attempt to gather some facts to understand the\u00a0scale. 
Generated with ChatGPT-4o Since comprehensive data on ChatGPT-4 is not readily available, we can consider Llama 3.1 [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[151,62,83,317,167],"tags":[320,318,319],"class_list":["post-306","post","type-post","status-publish","format-standard","hentry","category-ai","category-aimldsaimlds","category-data-science","category-responsible-ai","category-small-language-model","tag-about","tag-llms","tag-training"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/306"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=306"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/306\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=306"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=306"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}