{"id":333,"date":"2024-12-03T07:04:40","date_gmt":"2024-12-03T07:04:40","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2024\/12\/03\/google-gemini-is-entering-the-advent-of-code-challenge-dfd88ffa12a6\/"},"modified":"2024-12-03T07:04:40","modified_gmt":"2024-12-03T07:04:40","slug":"google-gemini-is-entering-the-advent-of-code-challenge-dfd88ffa12a6","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2024\/12\/03\/google-gemini-is-entering-the-advent-of-code-challenge-dfd88ffa12a6\/","title":{"rendered":"Google Gemini Is Entering the Advent of Code Challenge"},"content":{"rendered":"<div>\n<h4>An open-source project to explore the capabilities and limitations of LLMs on coding challenges<\/h4>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AesIxuKdYQRkHlT1COUgv8g.jpeg?ssl=1\"><figcaption>Image by author (created with Flux 1.1\u00a0Pro)<\/figcaption><\/figure>\n<h3>What is this\u00a0about?<\/h3>\n<p>If 2024 taught us anything in the realm of Generative AI, then it is that coding is one of the most promising applications for large language models\u00a0(LLMs).<\/p>\n<p>In this blog post, I will describe how I am using one of the most advanced LLMs, Gemini Experimental 1121, which currently leads the LMArena Leaderboard, to tackle the <a href=\"https:\/\/adventofcode.com\/\">Advent of Code<\/a> challenge.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2Ad_Hk8HfGxfgnbRgX6AONsg.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<p>I will outline my approach and share my <a href=\"https:\/\/github.com\/heiko-hotz\/advent-of-code-2024-gemini\">open-source code repository<\/a> so that 
readers can explore it further and replicate the\u00a0results.<\/p>\n<h3>Why should we\u00a0care?<\/h3>\n<p>There are many reasons why LLMs + Coding is an exciting area. To highlight a\u00a0few:<\/p>\n<ul>\n<li>Code is just like language and can be learned the same way by transformer models<\/li>\n<li>The output is easily validated\u200a\u2014\u200awe can just run the code and check whether it\u00a0works<\/li>\n<li>There is a huge demand for code assistants\u200a\u2014\u200athey can increase the productivity of coders\u00a0manyfold<\/li>\n<\/ul>\n<p>So, this is definitely an interesting and exciting direction, and I thought it might be fun to explore it a bit more with a hands-on challenge.<\/p>\n<h3>The Advent of Code challenge<\/h3>\n<p>For those not familiar with the Advent of Code challenge: It is an annual event that runs from December 1st to December 25th, offering daily programming puzzles similar to an advent calendar. Each day, a new two-part puzzle is released where coders can test their coding and problem-solving skills. It\u2019s a fun way for developers of all levels to practice\u00a0coding.<\/p>\n<p>Both parts of the daily challenge revolve around a similar problem and use the same input data. The idea is to write a Python program that will process the input data and produce a solution (typically a number). Once we have run the code and it has calculated the solution, we can paste that solution into the website, which will then tell us whether it is correct. If so, the second part is unlocked and follows a similar procedure.<\/p>\n<p>The competition runs for 25 days and allows users to collect a maximum of 50 stars (2 per\u00a0day).<\/p>\n<h3>A great challenge for\u00a0LLMs<\/h3>\n<p>As mentioned above, this is a great challenge for LLMs. 
We can just take the problem statement and plug it into an LLM of our choice, let it produce the code, run the code, and take the solution that was produced and paste it into the website to see if the LLM was successful.<\/p>\n<p>For this project I\u2019m using Gemini Experimental 1121, which has greatly improved coding and reasoning capabilities. It is available through Google\u2019s <a href=\"https:\/\/aistudio.google.com\/\">AI Studio<\/a>. I use the same system prompt throughout the challenge\u200a\u2014\u200ait is a zero-shot prompt (no chain-of-thought) with the addition that the code should expect the input via input redirection, like\u00a0so:<\/p>\n<pre>python day01\/part1.py &lt; day01\/input.txt<\/pre>\n<p>The system prompt\u00a0is:<\/p>\n<pre>Provide python code to solve a given puzzle.<br>Assume there is an input.txt file that can be read<br>via input redirection in the command line.<\/pre>\n<p>I then post the actual challenge and Gemini creates the code that should produce the correct solution. I copy the code into the GitHub repo, run it, and paste the produced solution into the Advent of Code website to see if it is\u00a0correct.<\/p>\n<h3>The repository<\/h3>\n<p>Each day\u2019s challenge is organized in its own directory:<\/p>\n<pre>dayXX\/<br>\u251c\u2500\u2500 input.txt         # Challenge input<br>\u251c\u2500\u2500 part1-problem.txt # Problem description for part 1<br>\u251c\u2500\u2500 part2-problem.txt # Problem description for part 2<br>\u251c\u2500\u2500 part1.py         # Solution for part 1<br>\u2514\u2500\u2500 part2.py         # Solution for part 2<\/pre>\n<p>The part1-problem and part2-problem text files contain the problems of the challenge as stated by Advent of Code. 
I also appended the correct solution to the end of each text\u00a0file:<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2Adc4FjBgbhXyC9-EnELbcWA.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<p>The Python scripts contain the code as produced by Gemini. To be fully transparent, I also link to the actual conversations so that everyone can see and review the\u00a0steps:<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AdZfQ7rrkVYH-Ql-RyE-fjw.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<p>To see an example of one of these chats, head over to <a href=\"https:\/\/aistudio.google.com\/app\/prompts?state=%7B%22ids%22:%5B%221kkRVShxln7z6qfKgsVEtP20hozJj7YkA%22%5D,%22action%22:%22open%22,%22userId%22:%22105677632504908789218%22,%22resourceKeys%22:%7B%7D%7D&amp;usp=sharing\">my chat with Gemini about the day 1 challenge.<\/a><\/p>\n<p>I will record all the results in a table that gives readers a good first overview of how the model has fared so\u00a0far:<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AivivI6IOJTmjEQHTAj43Rg.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<h3>Example<\/h3>\n<p>To get a better idea of what this looks like, let\u2019s have a look at part 1 of the day 1 challenge. Here is the problem statement:<\/p>\n<pre>The Chief Historian is always present for the big Christmas sleigh launch, but nobody has seen him in months! Last anyone heard, he was visiting locations that are historically significant to the North Pole; a group of Senior Historians has asked you to accompany them as they check the places they think he was most likely to visit.<br><br>As each location is checked, they will mark it on their list with a star. 
They figure the Chief Historian must be in one of the first fifty places they'll look, so in order to save Christmas, you need to help them get fifty stars on their list before Santa takes off on December 25th.<br><br>Collect stars by solving puzzles. Two puzzles will be made available on each day in the Advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!<br><br>You haven't even left yet and the group of Elvish Senior Historians has already hit a problem: their list of locations to check is currently empty. Eventually, someone decides that the best place to check first would be the Chief Historian's office.<br><br>Upon pouring into the office, everyone confirms that the Chief Historian is indeed nowhere to be found. Instead, the Elves discover an assortment of notes and lists of historically significant locations! This seems to be the planning the Chief Historian was doing before he left. Perhaps these notes can be used to determine which locations to search?<br><br>Throughout the Chief's office, the historically significant locations are listed not by name but by a unique number called the location ID. To make sure they don't miss anything, The Historians split into two groups, each searching the office and trying to create their own complete list of location IDs.<br><br>There's just one problem: by holding the two lists up side by side (your puzzle input), it quickly becomes clear that the lists aren't very similar. Maybe you can help The Historians reconcile their lists?<br><br>For example:<br><br>3   4<br>4   3<br>2   5<br>1   3<br>3   9<br>3   3<br>Maybe the lists are only off by a small amount! To find out, pair up the numbers and measure how far apart they are. 
Pair up the smallest number in the left list with the smallest number in the right list, then the second-smallest left number with the second-smallest right number, and so on.<br><br>Within each pair, figure out how far apart the two numbers are; you'll need to add up all of those distances. For example, if you pair up a 3 from the left list with a 7 from the right list, the distance apart is 4; if you pair up a 9 with a 3, the distance apart is 6.<br><br>In the example list above, the pairs and distances would be as follows:<br><br>The smallest number in the left list is 1, and the smallest number in the right list is 3. The distance between them is 2.<br>The second-smallest number in the left list is 2, and the second-smallest number in the right list is another 3. The distance between them is 1.<br>The third-smallest number in both lists is 3, so the distance between them is 0.<br>The next numbers to pair up are 3 and 4, a distance of 1.<br>The fifth-smallest numbers in each list are 3 and 5, a distance of 2.<br>Finally, the largest number in the left list is 4, while the largest number in the right list is 9; these are a distance 5 apart.<br>To find the total distance between the left list and the right list, add up the distances between all of the pairs you found. In the example above, this is 2 + 1 + 0 + 1 + 2 + 5, a total distance of 11!<br><br>Your actual left and right lists contain many location IDs. 
What is the total distance between your lists?<br><br>### Solution: 1666427<\/pre>\n<p>When given this challenge, Gemini produces this\u00a0code:<\/p>\n<pre>import sys<br><br>def solve():<br>    left_list = []<br>    right_list = []<br>    <br>    for line in sys.stdin:<br>        left, right = map(int, line.strip().split())<br>        left_list.append(left)<br>        right_list.append(right)<br>    <br>    left_list.sort()<br>    right_list.sort()<br>    <br>    total_distance = 0<br>    for i in range(len(left_list)):<br>        total_distance += abs(left_list[i] - right_list[i])<br>    <br>    print(total_distance)<br><br>solve()<\/pre>\n<p>Running this code\u00a0with<\/p>\n<pre>python day01\/part1.py &lt; day01\/input.txt<\/pre>\n<p>gives the\u00a0result:<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2ARLQJPBDdzocVUt2uhvJzKg.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<p>As we can see, Gemini\u2019s code produced the correct\u00a0result.<\/p>\n<h4>Bonus: Get an explanation<\/h4>\n<p>As a bonus, we can also ask Gemini to explain what the code does and why it solves the challenge:<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AHVyqkwkG73bDq047MTSVXg.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<p>You can find the chat with the explanation <a href=\"https:\/\/aistudio.google.com\/app\/prompts?state=%7B%22ids%22:%5B%221E14ypphhy7nDgWvxRy_QVv5AM4YVLaVM%22%5D,%22action%22:%22open%22,%22userId%22:%22105677632504908789218%22,%22resourceKeys%22:%7B%7D%7D&amp;usp=sharing\">here<\/a>.<\/p>\n<h3>Conclusion<\/h3>\n<p>With this project I want to explore how capable state-of-the-art LLMs currently are at solving coding challenges. 
My hypothesis is that Gemini (and other LLMs) have gotten good enough to solve most of these challenges. This does not, of course, mean that they are fit (yet) to solve real software challenges that are much more\u00a0complex.<\/p>\n<p>That being said, I was just curious about this and decided to hop onto this fun little project. I hope you enjoy it and that it gives you some insight into where we are headed with LLMs + Coding\u00a0\ud83e\udd17<\/p>\n<h3>Heiko Hotz<\/h3>\n<p>\ud83d\udc4b Follow me on <a href=\"https:\/\/heiko-hotz.medium.com\/\">Medium<\/a> and <a href=\"https:\/\/www.linkedin.com\/in\/heikohotz\/\">LinkedIn<\/a> to read more about Generative AI, Machine Learning, and Natural Language Processing.<\/p>\n<p>\ud83d\udc65 If you\u2019re based in London, join one of our <a href=\"https:\/\/www.meetup.com\/nlp_london\/\">NLP London\u00a0Meetups<\/a>.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/750\/0%2ACq-ZbBArQ4czUbnB.png?ssl=1\"><figcaption>Image by\u00a0author<\/figcaption><\/figure>\n<hr>\n<p><a href=\"https:\/\/towardsdatascience.com\/google-gemini-is-entering-the-advent-of-code-challenge-dfd88ffa12a6\">Google Gemini Is Entering the Advent of Code Challenge<\/a> was originally published in <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Heiko Hotz<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/medium.com\/m\/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2Fgoogle-gemini-is-entering-the-advent-of-code-challenge-dfd88ffa12a6\">Go to original source<\/a><br 
\/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Google Gemini Is Entering the Advent of Code Challenge An open-source project to explore the capabilities and limitations of LLMs on coding challenges Image by author (created with Flux 1.1\u00a0Pro) What is this\u00a0about? If 2024 taught us anything in the realm of Generative AI, then it is that coding is one of the most promising [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,367,365,366,77,87],"tags":[370,369,368],"class_list":["post-333","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-chatgpt","category-coding","category-gemini","category-genai","category-llm","tag-advent","tag-challenge","tag-code"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/333"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=333"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/333\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=333"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=333"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=333"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}