{"id":2636,"date":"2025-03-26T07:02:22","date_gmt":"2025-03-26T07:02:22","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/03\/26\/testing-the-power-of-multimodal-ai-systems-in-reading-and-interpreting-photographs-maps-charts-and-more\/"},"modified":"2025-03-26T07:02:22","modified_gmt":"2025-03-26T07:02:22","slug":"testing-the-power-of-multimodal-ai-systems-in-reading-and-interpreting-photographs-maps-charts-and-more","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/03\/26\/testing-the-power-of-multimodal-ai-systems-in-reading-and-interpreting-photographs-maps-charts-and-more\/","title":{"rendered":"Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and\u00a0More"},"content":{"rendered":"<div>\n<h2 class=\"wp-block-heading\"><mdspan datatext=\"el1742844644136\" class=\"mdspan-comment\">Introduction<\/mdspan><\/h2>\n<p class=\"wp-block-paragraph\" id=\"a828\">It\u2019s no news that artificial intelligence has made huge strides in recent years, particularly with the advent of multimodal models that can process and create both text and images, and some very new ones that also process and produce audio and video. I think that these <a href=\"https:\/\/towardsdatascience.com\/tag\/ai-systems\/\" title=\"Ai Systems\">AI systems<\/a> have the potential to revolutionize data analysis, robotics, and even everyday tasks like navigation and information extraction from visuals. 
Along these lines, I recently posed myself the following question:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"e36a\">Can multimodal AI systems consisting of large language models with vision capabilities understand figures that contain information, then process it and produce summaries, explanations, object identification, etc.?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"7a31\">Although this is in itself a question that would require a whole research project to be properly addressed, I was extremely curious about it, so I needed to get at least a rough answer. Therefore, I carried out some tests on how much OpenAI\u2019s vision-enhanced models understand about photographs, screenshots and other images containing plots, charts, shots from a driver\u2019s position or mimicking a robot\u2019s onboard camera, and even molecular structures.<\/p>\n<p class=\"wp-block-paragraph\" id=\"3e1d\">More specifically, I tried the GPT-4o and GPT-4o-mini models developed by OpenAI, through ChatGPT or through their API. To my amazement, I found that these models can indeed understand quite a bit of what they see!<\/p>\n<p class=\"wp-block-paragraph\" id=\"c116\">Let me present here the most interesting results I obtained, either directly through ChatGPT or programmatically, including code for a web app where you can paste a picture and have GPT-4o-mini analyze it.<\/p>\n<h3 class=\"wp-block-heading\">First of All: How to Analyze Images With OpenAI\u2019s GPT-4o Models, Via ChatGPT or Programmatically<\/h3>\n<p class=\"wp-block-paragraph\">I carried out my first tests right away on ChatGPT\u2019s free version using GPT-4o, which allows up to 4 free uses of the \u201cvision\u201d capabilities per day. To use them, you must upload a picture or paste one from the clipboard, and it is then sent together with the prompt. 
A paid ChatGPT account allows more uses, and the OpenAI API lets you do this programmatically for as long as token credits are available, as I will cover here.<\/p>\n<h3 class=\"wp-block-heading\" id=\"58a6\">Processing prompts containing pictures programmatically<\/h3>\n<p class=\"wp-block-paragraph\" id=\"06a7\">But how do you process images with GPT-4o models programmatically? Well, it isn\u2019t that complex, as explained at\u00a0<a href=\"https:\/\/platform.openai.com\/docs\/guides\/images?api-mode=chat&amp;lang=javascript&amp;format=base64-encoded\" rel=\"noreferrer noopener\" target=\"_blank\">OpenAI\u2019s API reference<\/a>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"4d75\">In JavaScript, which I love because it lets me easily write programs that run online out of the box (as you already know if you follow me), you just need a function call that looks like this:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-javascript\">async function sendToOpenAI(imageData) {\n    const apiKey = \"...API KEY HERE\";\n    \/\/ imageData is assumed to be a data URL; keep only the base64 payload after the comma\n    const base64Image = imageData.split(\",\")[1];\n    \n    \/\/ Send the text prompt and the image together in a single user message\n    const response = await fetch(\"https:\/\/api.openai.com\/v1\/chat\/completions\", {\n        method: \"POST\",\n        headers: {\n            \"Content-Type\": \"application\/json\",\n            \"Authorization\": `Bearer ${apiKey}`\n        },\n        body: JSON.stringify({\n            model: \"gpt-4o-mini\",\n            messages: [{\n                role: \"user\",\n                content: [\n                    { type: \"text\", text: \"what's in this image?\" },\n                    { type: \"image_url\", image_url: { url: `data:image\/jpeg;base64,${base64Image}` } }\n                ]\n            }]\n        })\n    });\n    \n    const result = await response.json();\n    if (!response.ok) {\n        \/\/ Show the API error instead of failing on a missing `choices` field\n        document.getElementById(\"response\").innerText = result.error?.message ?? `Request failed with status ${response.status}`;\n        return;\n    }\n    \/\/ Display the model's reply in the page element with id \"response\"\n    document.getElementById(\"response\").innerText = result.choices[0].message.content;\n}<\/code><\/pre>\n<h3 class=\"wp-block-heading\" id=\"25b0\">A web app to 
quickly run image-text prompts<\/h3>\n<p class=\"wp-block-paragraph\" id=\"e1c1\">In this example web app you can provide an API key and a text prompt, and then paste an image from the clipboard that will be immediately sent to OpenAI for analysis together with the prompt.<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/lucianoabriata.altervista.org\/tests\/gpt-vision\/gpt-4o-mini-vision.html\">https:\/\/lucianoabriata.altervista.org\/tests\/gpt-vision\/gpt-4o-mini-vision.html<\/a><\/p>\n<p class=\"wp-block-paragraph\">The web app then displays the answer, as in this example:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"654\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_IuAuQqpgjb9BCwvD3wvI7g-1024x654.webp?resize=1024%2C654&#038;ssl=1\" alt=\"\" class=\"wp-image-600098\"><figcaption class=\"wp-element-caption\">This and all other images, photographs and screenshots shown in this article are by the author.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">(Keep reading: towards the end of the post you will find how Google\u2019s Gemini 2.0 interpreted this screenshot, with an answer that was mind-blowingly correct, as it captured the artist\u2019s actual intention!)<\/p>\n<h3 class=\"wp-block-heading\">Challenging GPT-4o Models With Varied Examples of Image-Text Prompts<\/h3>\n<p class=\"wp-block-paragraph\" id=\"9006\">Using the above app to run GPT-4o-mini and ChatGPT\u2019s free version to run GPT-4o, I evaluated how well these models understand maps, photographs that mimic what a driver sees on the road, camera views of the position of a robot\u2019s arm relative to some target objects, various kinds of plots and charts presenting data, and software screenshots including some that display molecules.<\/p>\n<p class=\"wp-block-paragraph\">I am stunned at how much the models understand! 
Read on to discover it all, take a look at my first tests with the impressive Gemini 2.0 Flash from Google, and then perhaps go try your favorite AI models with your own ideas, sharing in the comments what you find!<\/p>\n<h2 class=\"wp-block-heading\">1. Vision-Based Analysis of a Google Maps Screenshot<\/h2>\n<p class=\"wp-block-paragraph\" id=\"e1ba\">Sometimes, maps contain so much information that you can get lost in them. Could a multimodal system help in such a situation? Let\u2019s find out!<\/p>\n<p class=\"wp-block-paragraph\">Here on the left is a screenshot from a Google Maps view that I fed into GPT-4o-mini via the API together with the question \u201cWhat\u2019s in this image?\u201d. You can see the answer on the right:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"335\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_liB8AR-oZa7Pr3b9rmw1GA-1024x335.webp?resize=1024%2C335&#038;ssl=1\" alt=\"\" class=\"wp-image-600099\"><\/figure>\n<h2 class=\"wp-block-heading\" id=\"a1f0\">2. Interpreting Driving Signs on a Road<\/h2>\n<p class=\"wp-block-paragraph\" id=\"3abb\">Could multimodal systems perhaps help you drive? Of course this deserves a full investigation in order to draw a solid conclusion, but let\u2019s see what happens in some quick tests. In each test, I provided a photograph of what a driver sees on the road while driving, together with this text prompt:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"91bb\">Interpret any road signs or things I should pay attention to as a driver<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"9b8f\">Here are the results on some example shots, all run with the web app provided above, i.e. 
using GPT-4o-mini:<\/p>\n<p class=\"wp-block-paragraph\" id=\"3c27\">First, for this photograph:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"601\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_p3_84SRHMN3Wja7WIfRGhA-1024x601.webp?resize=1024%2C601&#038;ssl=1\" alt=\"\" class=\"wp-image-600100\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"e55b\">GPT-4o-mini said the following:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"8dc2\">As you drive, pay attention to the following signs and features in the image:<\/p>\n<p class=\"wp-block-paragraph\" id=\"6770\">1. **\u201dKeep Right\u201d Sign**: This indicates that you should stay to the right side of the road, likely because of an island or obstruction ahead.<\/p>\n<p class=\"wp-block-paragraph\" id=\"9b85\">2. **Curved Road Ahead**: The layout of the road suggests you may encounter a curve, which means you should reduce your speed and stay alert for any oncoming vehicles.<\/p>\n<p class=\"wp-block-paragraph\" id=\"2159\">3. **Speed Limit**: The sign showing a speed limit of 40 suggests that you should not exceed this speed.<\/p>\n<p class=\"wp-block-paragraph\" id=\"1c6f\">4. **Road Layout**: Be mindful of the marking lines and how they direct traffic. The two solid yellow lines indicate that crossing them (to pass another vehicle) is prohibited.<\/p>\n<p class=\"wp-block-paragraph\" id=\"2501\">5. **Surroundings**: Note the vegetation and landscape along the road, which may obscure visibility in some areas. 
Maintain awareness of pedestrians or cyclists.<\/p>\n<p class=\"wp-block-paragraph\" id=\"7573\">Make sure to remain vigilant and adhere to these guidelines for safe driving.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"a190\">This answer was quite nice, complete, and useful!<\/p>\n<p class=\"wp-block-paragraph\" id=\"0024\">Let\u2019s try this next one, in a densely snowed place:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"576\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_nR5azOm6zTuxTw8fLAJGYA-1024x576.webp?resize=1024%2C576&#038;ssl=1\" alt=\"\" class=\"wp-image-600101\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"0a0f\">Here it clearly picked up the snow, and correctly inferred it is a region with mountains, offering some quite general advice that suits the situation but doesn\u2019t seem very specific to it:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"7d02\">When driving in snowy mountain conditions, there are several important things to pay attention to:<\/p>\n<p class=\"wp-block-paragraph\" id=\"8999\">1. **Yield Sign**: If there\u2019s a yield sign, be prepared to slow down and give way to any traffic on the intersecting road.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6d35\">2. **Road Conditions**: Snow can make roads slick and reduce visibility. Drive cautiously and maintain a safe distance from other vehicles.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6408\">3. **Speed Limit**: Always adhere to the posted speed limit but consider reducing your speed further due to the weather conditions.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6cd8\">4. **Slippery Surfaces**: Be aware that certain areas may be icier or more treacherous. 
Watch for signs indicating slippery roads or do not enter areas.<\/p>\n<p class=\"wp-block-paragraph\" id=\"c7c4\">5. **Falling Rocks**: In mountainous areas, watch for any signs indicating potential rockfall or land slides, which may be exacerbated by snow.<\/p>\n<p class=\"wp-block-paragraph\" id=\"7cb1\">6. **Wildlife**: In rural or mountainous regions, be alert for wildlife crossing the road, especially in winter months.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ac6b\">7. **Visibility**: If visibility is low due to snow, be sure to use your headlights and reduce your speed.<\/p>\n<p class=\"wp-block-paragraph\" id=\"7b8f\">Always ensure that your vehicle is equipped for winter conditions, including snow tires or chains if necessary.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"27a0\">Let\u2019s try a last one, which features a tunnel, a car right ahead of the driver, some warnings on the left, and two sign posts on the right:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"525\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_jtJD3MJXQmFl715mQlzAfw-1024x525.webp?resize=1024%2C525&#038;ssl=1\" alt=\"\" class=\"wp-image-600102\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"3371\">Here\u2019s the answer, which seems to be warning about all general and situation-specific issues:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"d1c7\">Here are some important points to note based on the image:<\/p>\n<p class=\"wp-block-paragraph\" id=\"e81f\">1. **Tunnel Ahead**: You are approaching a tunnel. Prepare for possible changes in visibility and ensure your vehicle\u2019s headlights are on.<\/p>\n<p class=\"wp-block-paragraph\" id=\"236c\">2. 
**Speed and Space**: Always maintain a safe distance from the vehicle in front of you, especially in confined spaces like tunnels.<\/p>\n<p class=\"wp-block-paragraph\" id=\"b0d2\">3. **No Pedestrians or Bicycles**: The sign indicates that pedestrians and bicycles are not allowed in the area. Stay alert to ensure you don\u2019t encounter anyone on the road.<\/p>\n<p class=\"wp-block-paragraph\" id=\"39f8\">4. **Road Barriers and Traffic Flow**: Pay attention to the road markings and barriers as they guide lane usage and safe navigation near the tunnel entrance.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a67e\">5. **Weather and Terrain**: The surrounding scenery suggests that weather conditions may change quickly, especially driving in mountain areas. Be prepared for changes in road conditions.<\/p>\n<p class=\"wp-block-paragraph\" id=\"64a7\">Always drive cautiously and obey any traffic signs or signals present.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"2d83\">Note, however, that it missed the specific change in speed limit. I tried this photograph again with some more focused text prompts, but it never said anything about the change in speed limit specific to this situation, instead providing general recommendations such as this:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"2ba0\">The maximum speed limit inside a tunnel can vary based on the country\u2019s traffic regulations, the specific road, and the tunnel\u2019s design. Generally, in many places, typical speed limits range from 50 to 80 km\/h (about 30 to 50 mph). 
To determine the exact speed limit for a specific tunnel, you would need to look for posted signs or consult local traffic regulations.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"ebab\">This is correct, but it shows that the model wasn\u2019t capable of extracting the information specifically available here, which would be something along the lines of \u201cMaximum speed is not indicated, but it is definitely above 50 according to the visible sign post.\u201d<\/p>\n<h2 class=\"wp-block-heading\" id=\"0a28\">3. Could Vision-Ready Models Drive Robots?<\/h2>\n<p class=\"wp-block-paragraph\" id=\"07fb\">If an AI model understands pictures well enough, one could then think about feeding a robot\u2019s camera feed to the model so that it can produce outputs that make sense. Of course this is a huge topic to study in itself, as the system must produce commands that end up as mechanical movements; but let\u2019s see here how the core part (i.e. converting views to broad actions) would work.<\/p>\n<p class=\"wp-block-paragraph\" id=\"f263\">Given this photograph of a robotic arm on a table where a red ball and a blue ball wait to be grabbed:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_xLGJw_ijqE2s2f7mT26eJg.webp?ssl=1\" alt=\"\" class=\"wp-image-600103\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"d53b\">I ask GPT-4o the following:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"eb15\">How should the robotic arm move in order to grab the blue sphere?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"3647\">To which the LLM replies quite correctly, even asking me if I want it to follow up with code generation:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" 
id=\"810c\">To grab the blue sphere, the robotic arm should follow these steps:<\/p>\n<p class=\"wp-block-paragraph\" id=\"2f41\"><strong>Position the Base<\/strong>: Rotate the base so that the arm aligns with the blue sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6e31\"><strong>Lower the Arm<\/strong>: Adjust the arm\u2019s height by moving its joints to bring the gripper closer to the sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"3764\"><strong>Extend Forward<\/strong>: If necessary, extend the arm forward to reach the blue sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"f4a9\"><strong>Open the Gripper<\/strong>: Ensure the gripper is open wide enough to grasp the sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d30a\"><strong>Close the Gripper<\/strong>: Gently close the gripper around the sphere to secure it.<\/p>\n<p class=\"wp-block-paragraph\" id=\"69e9\"><strong>Lift the Sphere<\/strong>: Raise the arm slightly while keeping the sphere firmly held.<\/p>\n<p class=\"wp-block-paragraph\" id=\"7eb5\"><strong>Retract or Move to Destination<\/strong>: Move the arm to a desired location for placing the sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"8e42\">These movements should be smooth to avoid dropping or misaligning with the sphere. Would you like assistance with programming the robotic arm for this task?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"583e\">I feel the answers are quite good, but lack precision. 
So, rather than going down the road of asking for code, I will question the LLM a bit more deeply to see what it really understands in this picture:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"a2d7\">When you say \u201cPosition the Base: Rotate the base so that the arm aligns with the blue sphere.\u201d should the arm move to its left or to its right, based on the image?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"8bf6\">The answer turns out to be very good:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"0f86\">Based on the image, the robotic arm should\u00a0<strong>rotate to its left<\/strong>\u00a0to align with the blue sphere. The sphere is positioned to the left of the arm\u2019s current orientation, so rotating counterclockwise will bring the gripper closer to the sphere.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"5de5\">Yet I dig a bit deeper:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"344a\">How much do you estimate it should rotate?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"94c7\">This new answer is essentially the same as what I, as a human, would estimate from the photograph:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"e584\">Estimating from the image, the robotic arm should rotate its base\u00a0<strong>approximately 30 to 45 degrees counterclockwise<\/strong>\u00a0to align with the blue sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"770d\">This estimate assumes that:<\/p>\n<p class=\"wp-block-paragraph\" id=\"7a78\">The arm\u2019s current position is facing slightly to the right.<\/p>\n<p class=\"wp-block-paragraph\" id=\"e182\">The blue sphere is placed in front and to the left of the 
arm.<\/p>\n<p class=\"wp-block-paragraph\" id=\"4509\">Once the base is rotated, minor adjustments to the arm joints may be needed to fine-tune the position before grasping the sphere.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"362c\">Finally, I ask about the red ball, to test if the model understands that it is on the other side and a smaller rotation is required:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"9564\">And how about grabbing the red ball?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"5cf7\">And it does get it right!<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"0641\">To grab the\u00a0<strong>red sphere<\/strong>, the robotic arm should follow these steps:<\/p>\n<p class=\"wp-block-paragraph\" id=\"23ac\"><strong>Rotate the Base<\/strong>: The base should rotate\u00a0<strong>slightly to the left<\/strong>, but much less than for the blue sphere. 
An estimated\u00a0<strong>15 to 20 degrees counterclockwise<\/strong>\u00a0should align the arm with the red sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a512\"><strong>Lower the Arm<\/strong>: Bring the gripper down closer to the red sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d0d3\"><strong>Extend Forward (if needed)<\/strong>: Adjust the arm\u2019s reach if the gripper is not directly above the sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a841\"><strong>Open the Gripper<\/strong>: Ensure the gripper is wide enough to grab the sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"614b\"><strong>Close the Gripper<\/strong>: Securely grasp the red sphere.<\/p>\n<p class=\"wp-block-paragraph\" id=\"0356\"><strong>Lift the Arm<\/strong>: Raise the sphere carefully.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d8b9\"><strong>Move to Destination<\/strong>: Transport the sphere to a desired location if needed.<\/p>\n<p class=\"wp-block-paragraph\" id=\"be57\">Since the red sphere is closer to the robotic arm than the blue sphere, less movement is required. Would you like a precise angle calculation based on measurements?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"66fd\">If you\u2019re also excited about the possibility of\u00a0<a href=\"https:\/\/x.com\/adcock_brett\/status\/1901303220300829066\" rel=\"noreferrer noopener\" target=\"_blank\">driving robots with vision models, check out this thread on X.com<\/a>.<\/p>\n<h2 class=\"wp-block-heading\" id=\"d32a\">4. 
Vision Models Acting on Data Plots<\/h2>\n<p class=\"wp-block-paragraph\" id=\"3013\">Since this is\u00a0<em>Towards Data Science<\/em>, we can\u2019t escape the obvious question: can vision-assisted models understand the content of plots, and perhaps even draw conclusions from them?<\/p>\n<p class=\"wp-block-paragraph\" id=\"3000\">I ran quite a few tests here, covering different kinds of information and plot types.<\/p>\n<h4 class=\"wp-block-heading\" id=\"e5e3\">Detecting high tides in a tide chart<\/h4>\n<p class=\"wp-block-paragraph\" id=\"1d5a\">I began with the following chart of tide highs and lows for one of the Azores islands. The chart is written in Portuguese, but I asked my question in English:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"9762\">When are the peak highs in the tides according to the chart?<\/p>\n<\/blockquote>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"490\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_ZjILdOXSFFx14qHCStlHwg-1024x490.webp?resize=1024%2C490&#038;ssl=1\" alt=\"\" class=\"wp-image-600104\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"1529\">This was the answer:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"e5ed\">The peak high tides in the chart are marked by blue dots with corresponding times. 
The highest tides occur at the following times:<\/p>\n<p class=\"wp-block-paragraph\" id=\"6c35\"><strong>11 Aug:<\/strong>\u00a011:18<br \/><strong>12 Aug:<\/strong>\u00a012:15<br \/><strong>13 Aug:<\/strong>\u00a012:58<br \/><strong>14 Aug:<\/strong>\u00a013:33<br \/><strong>15 Aug:<\/strong>\u00a014:05<br \/><strong>16 Aug:<\/strong>\u00a014:35<br \/><strong>17 Aug:<\/strong>\u00a015:05<\/p>\n<p class=\"wp-block-paragraph\" id=\"eee5\">These times represent when the tide reaches its highest point each day.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"20ca\">You can immediately see that the displayed data is correct, but it only covers high tides during the day and totally overlooks the high tides that happen at night! Hence, I asked for that specifically, and got a much more complete answer:<\/p>\n<p class=\"wp-block-paragraph\" id=\"8bb6\">I said:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"476d\">But the times you say are all during the days\u2026 doesn\u2019t the chart show also peak high tides at night?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"1bc8\">GPT-4o said:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"d3e8\">You\u2019re absolutely right! The chart also shows peak high tides occurring at night. 
Here are all the\u00a0<strong>peak high tide<\/strong>\u00a0times (both daytime and nighttime) according to the chart:<\/p>\n<p class=\"wp-block-paragraph\" id=\"b36e\"><strong>11 Aug:<\/strong>\u00a011:18 (Day)<br \/><strong>12 Aug:<\/strong>\u00a012:15 (Day) &amp; 23:49 (Night)<br \/><strong>13 Aug:<\/strong>\u00a012:58 (Day) &amp; 00:41 (Night)<br \/><strong>14 Aug:<\/strong>\u00a013:33 (Day) &amp; 01:20 (Night)<br \/><strong>15 Aug:<\/strong>\u00a014:05 (Day) &amp; 01:54 (Night)<br \/><strong>16 Aug:<\/strong>\u00a014:35 (Day) &amp; 02:24 (Night)<br \/><strong>17 Aug:<\/strong>\u00a015:05 (Day) &amp; 02:53 (Night)<\/p>\n<p class=\"wp-block-paragraph\" id=\"9448\">Each day has\u00a0<strong>two peak high tides<\/strong>, one during the day and one at night. Thanks for catching that!<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"e180\">That\u2019s now perfect. Finally, I asked for one more piece of information:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"270c\">OK thanks. Can you say where on earth this chart is for?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"faca\">And again the answer was right:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"5460\">Yes! The chart is for\u00a0<strong>Ponta Delgada, S\u00e3o Miguel Island<\/strong>, which is part of the\u00a0<strong>Azores<\/strong>, an autonomous region of\u00a0<strong>Portugal<\/strong>\u00a0in the North Atlantic Ocean. This location is known for its stunning volcanic landscapes, coastal views, and maritime activities. 
The tide chart shows the tidal schedule for this area from\u00a0<strong>August 11 to August 17<\/strong>.<\/p>\n<\/blockquote>\n<h4 class=\"wp-block-heading\" id=\"3014\">Interpreting a height profile from a photo<\/h4>\n<p class=\"wp-block-paragraph\" id=\"c8da\">Next I made the task a bit harder: the image is a photograph of a screen (see the interfering horizontal bands), the x axis of the plot is incomplete, and the question requires some reasoning:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"0842\">Looking at this, would I go up or down from Andermatt to Wassen? By how many meters and over how many km?<\/p>\n<\/blockquote>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"576\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_uVlWaFj0dsovGS3aXPZ2Zg-1024x576.webp?resize=1024%2C576&#038;ssl=1\" alt=\"\" class=\"wp-image-600105\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"02e1\">The answer was, however, perfect! Here it is:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"f4c5\">From the elevation profile, Andermatt is at\u00a0<strong>1437m<\/strong>, and Wassen is at\u00a0<strong>916m<\/strong>. This means that traveling from Andermatt to Wassen involves a\u00a0<strong>descent of 521 meters<\/strong>\u00a0(1437m - 916m).<\/p>\n<p class=\"wp-block-paragraph\" id=\"a430\">Looking at the horizontal axis, the distance from Andermatt to Wassen appears to be\u00a0<strong>around 10 km<\/strong>. 
So, you would be going\u00a0<strong>downhill by 521 meters over approximately 10 km<\/strong>.<\/p>\n<\/blockquote>\n<h4 class=\"wp-block-heading\" id=\"3182\">Interpreting plots from RNA-seq data<\/h4>\n<p class=\"wp-block-paragraph\" id=\"afb3\">I then tested GPT-4o\u2019s capability to interpret plots produced as part of a bioinformatics pipeline. First, I made it process a volcano plot, which is one of the most common kinds of graphs used to analyze differential gene expression data. I took an example image from\u00a0<a href=\"https:\/\/scienceparkstudygroup.github.io\/rna-seq-lesson\/06-differential-analysis\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\">this free online tutorial<\/a>, and asked the system the following:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"b11c\">You are a bioinformatics expert. What would you conclude from this volcano plot? I mean, which genes are up- or down-regulated?<\/p>\n<\/blockquote>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/0_fVSTG_CvPHMB8Vc-.webp?ssl=1\" alt=\"\" class=\"wp-image-600106\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"efd8\">This was GPT-4o\u2019s answer:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"de81\">This volcano plot represents differentially expressed genes in\u00a0<em>Arabidopsis thaliana<\/em>\u00a0infected with\u00a0<em>Pseudomonas syringae DC3000<\/em>\u00a0compared to a mock treatment. Here\u2019s my interpretation:<\/p>\n<p class=\"wp-block-paragraph\" id=\"7853\"><strong>1. 
X-axis (Log\u2082 fold change)<\/strong>:<br \/>Positive values indicate\u00a0<strong>up-regulated genes<\/strong>\u00a0(higher expression in infected samples).<br \/>Negative values indicate\u00a0<strong>down-regulated genes<\/strong>\u00a0(lower expression in infected samples).<\/p>\n<p class=\"wp-block-paragraph\" id=\"28d6\"><strong>2. Y-axis (-Log\u2081\u2080 P-value)<\/strong>:<br \/>Higher values indicate stronger statistical significance.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6a91\"><strong>3. Color coding<\/strong>:<br \/><strong>Gray<\/strong>: Not significant.<br \/><strong>Blue<\/strong>: Significant by p-value but not by fold change.<br \/><strong>Red<\/strong>: Significant by both p-value and fold change threshold.<\/p>\n<p class=\"wp-block-paragraph\" id=\"c408\"><strong>Key findings:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"73d9\"><strong>Up-regulated genes (right side, red dots)<\/strong>:<br \/>Strongly induced genes include\u00a0<strong>AT5G42800, AT1G52440, AT1G73325, AT4G18440<\/strong>, among others.<br \/>These genes are highly expressed in infected plants.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ef3d\"><strong>Down-regulated genes (left side, red dots)<\/strong>:<br \/>Examples include\u00a0<strong>AT4G28940, AT5G04100, AT1G04220, AT1G02440<\/strong>.<br \/>These genes are significantly suppressed during infection.<\/p>\n<p class=\"wp-block-paragraph\" id=\"c880\"><strong>Biological Implications:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"b03f\">Up-regulated genes may be involved in plant immune responses, such as\u00a0<strong>defense signaling, stress response, or antimicrobial activity<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"5aeb\">Down-regulated genes might be involved in\u00a0<strong>normal cellular functions that get suppressed during infection<\/strong>.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"d322\">You can see how the first part of the response clearly reads out the main 
features of the plot, explaining what the axes and label colors are, and even considering the plot\u2019s title. Then, the model distills lists of genes that are up- or down-regulated and pass the thresholds with strong statistical significance. Note that the list is correct for the up-regulated genes but has a few problems with the down-regulated ones. On inspection of the plot it becomes clear that this is due to the labels not being clearly resolved, even for humans.<\/p>\n<p class=\"wp-block-paragraph\" id=\"3496\">Going one step further, the model somehow knows correct information about the identified genes, so it can draw some simple conclusions about the biological significance of the results.<\/p>\n<p class=\"wp-block-paragraph\" id=\"779d\">Truly amazing, were it not for that small problem with reading the labels, which in a real scenario could probably be improved.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<p class=\"wp-block-paragraph\" id=\"788f\">Then, in a second test related to RNA-seq data, I asked GPT-4o what it thought about two distributions of p-values shown together in a single graph (image\u00a0<a href=\"https:\/\/www.biostars.org\/p\/346323\/\" rel=\"noreferrer noopener\" target=\"_blank\">edited from this tutorial<\/a>):<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"fac4\">This figure shows two distributions of p-values from RNA-seq data. 
Do you think they are OK, or do you notice anything weird?<\/p>\n<\/blockquote>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"528\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_jRmV2aVdeEChZljuxsqMPQ-1024x528.webp?resize=1024%2C528&#038;ssl=1\" alt=\"\" class=\"wp-image-600107\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"5b4e\">This was the answer, again excellent:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"6672\">This histogram represents two distributions of p-values from RNA-seq data, likely comparing two different sets of statistical tests. Here\u2019s my assessment:<\/p>\n<p class=\"wp-block-paragraph\" id=\"800a\"><strong>* What looks normal?<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"bc40\"><strong>Uniform distribution of non-significant p-values<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"df8b\">The light blue bars seem to follow a relatively uniform distribution across the range, which is expected under the null hypothesis (genes with no real differential expression should have p-values uniformly distributed between 0 and 1).<\/p>\n<p class=\"wp-block-paragraph\" id=\"a7b4\"><strong>Enrichment of low p-values<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"179c\">The peak near\u00a0<strong>p = 0<\/strong>\u00a0suggests that a subset of genes show strong statistical significance, which is expected when there are truly differentially expressed genes.<\/p>\n<p class=\"wp-block-paragraph\" id=\"dcd0\"><strong>* What looks weird?<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"142e\"><strong>Excessive spike at p = 0<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"9fce\">If there is an unnaturally large spike at\u00a0<strong>exactly 0<\/strong>, this could indicate a bias or an issue in the statistical testing (e.g., 
overfitting, batch effects, or miscalculated p-values).<\/p>\n<p class=\"wp-block-paragraph\" id=\"7de0\"><strong>Bimodal distribution<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"c293\">The yellow distribution appears to have multiple peaks, which is unusual. Normally, p-value distributions should either be uniform (under the null hypothesis) or show a single peak at\u00a0<strong>low p-values<\/strong>\u00a0(for differentially expressed genes).<\/p>\n<p class=\"wp-block-paragraph\" id=\"c845\">Peaks at intermediate p-values might suggest issues with normalization, batch effects, or improper multiple-testing corrections.<\/p>\n<p class=\"wp-block-paragraph\" id=\"f5de\"><strong>* Potential issues to check<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"638f\"><strong>Overdispersion or poor model fit<\/strong>: Are the statistical methods used properly accounting for biological variability?<\/p>\n<p class=\"wp-block-paragraph\" id=\"b95a\"><strong>Multiple testing correction issues<\/strong>: If false discovery rate (FDR) correction wasn\u2019t applied correctly, the distribution could be skewed.<\/p>\n<p class=\"wp-block-paragraph\" id=\"9815\"><strong>Batch effects or confounding variables<\/strong>: If the data contains hidden covariates, it can introduce unexpected structure in the p-value distribution.<\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\" id=\"e7f8\">5. 
Analysis of Screenshots Containing Molecular Structures<\/h2>\n<p class=\"wp-block-paragraph\" id=\"1914\">If you follow me, you know that I work in various molecular sciences, so of course I had to try out these vision-enhanced models with this kind of input.<\/p>\n<h3 class=\"wp-block-heading\" id=\"387b\">Help using a program to generate 3D structures from 2D drawings<\/h3>\n<p class=\"wp-block-paragraph\" id=\"c82b\">First I checked GPT-4o-mini\u2019s capability to help a user who wants to create a molecule in 3D format from a drawing and knows that this should be possible with Hack-A-Mol, but doesn\u2019t know exactly how to use it.<\/p>\n<p class=\"wp-block-paragraph\" id=\"5c27\">By the way, I covered Hack-A-Mol here:<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/medium.com\/advances-in-biological-science\/purely-web-based-suites-for-manipulation-of-bio-molecular-data-cbb7fdd60df4\">https:\/\/medium.com\/advances-in-biological-science\/purely-web-based-suites-for-manipulation-of-bio-molecular-data-cbb7fdd60df4<\/a><\/p>\n<p class=\"wp-block-paragraph\">Now let\u2019s see how well GPT-4o can help interpret what this program produces. I uploaded this screenshot:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/0_jwrtjSVKvAStQnNi.webp?ssl=1\" alt=\"\" class=\"wp-image-600108\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"b08a\">And asked this question:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"8f3f\">I\u2019m trying to build a small molecule using this tool, by drawing and then clicking the arrow to the right\u2026. 
but I don\u2019t get very well what\u2019s being generated exactly\u2026 can you explain?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"f962\">This was the answer:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"34e2\">Yes! When you draw your small molecule and click the arrow, several things are generated:<\/p>\n<p class=\"wp-block-paragraph\" id=\"1aa8\"><strong>2D Structure (Left Panel)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"4e65\">This is your drawn molecule in\u00a0<strong>2D<\/strong>, using standard chemical notation.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ccdc\"><strong>3D Structure (Right Panel \u2014 JSmol Viewer)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"51e0\">This is the\u00a0<strong>3D molecular model<\/strong>\u00a0generated from your 2D drawing.<\/p>\n<p class=\"wp-block-paragraph\" id=\"8939\">It gives you a spatial representation of the molecule.<\/p>\n<p class=\"wp-block-paragraph\" id=\"8efc\"><strong>Molecular Identifiers (Below the 2D Editor)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"2621\"><strong>InChI (IUPAC International Chemical Identifier)<\/strong>: A text-based representation of the molecular structure.<\/p>\n<p class=\"wp-block-paragraph\" id=\"009d\"><strong>InChIKey<\/strong>: A hashed, fixed-length version of the InChI, useful for searching databases.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d10a\"><strong>SMILES (Simplified Molecular Input Line Entry System)<\/strong>: A compact string representation of the molecule.<\/p>\n<p class=\"wp-block-paragraph\" id=\"60a6\"><strong>Structural Data (Bottom Text Box)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"d730\">You see data in\u00a0<strong>MOL\/SDF format<\/strong>, which contains atomic coordinates, bonding information, and metadata about the molecule.<\/p>\n<p class=\"wp-block-paragraph\" id=\"295c\">Other formats available 
include\u00a0<strong>XYZ, PDB, and CIF<\/strong>, which are useful for computational chemistry and molecular modeling.<\/p>\n<p class=\"wp-block-paragraph\" id=\"c70b\">What Can You Do with This?<\/p>\n<p class=\"wp-block-paragraph\" id=\"b8c2\">If you want to\u00a0<strong>simulate the molecule or use it in software like AlphaFold 3<\/strong>, you should\u00a0<strong>export the structure in SDF or PDB format<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6c06\">If you want to\u00a0<strong>search for the molecule online<\/strong>, use the\u00a0<strong>InChIKey or SMILES<\/strong>.<\/p>\n<\/blockquote>\n<h3 class=\"wp-block-heading\" id=\"c849\">Using GPT-4o with vision to help understand AlphaFold models<\/h3>\n<p class=\"wp-block-paragraph\" id=\"2806\">In the next example, I tested how well GPT-4o can understand structural models of proteins produced by AlphaFold (in this case AlphaFold 2 as utilized to build the AlphaFold-EBI database of models). By the way, if you have no idea what AlphaFold is, check out these posts:<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/medium.com\/advances-in-biological-science\/alphafold-3s-advances-and-limitations-in-a-nutshell-5ae0cdd814ea\">https:\/\/medium.com\/advances-in-biological-science\/alphafold-3s-advances-and-limitations-in-a-nutshell-5ae0cdd814ea<\/a><\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/lucianosphere.medium.com\/guide-to-my-blog-articles-on-alphafold-39c10a7dad7f\">https:\/\/lucianosphere.medium.com\/guide-to-my-blog-articles-on-alphafold-39c10a7dad7f<\/a><\/p>\n<p class=\"wp-block-paragraph\">It is well-known that the AlphaFold 2 and 3 systems revolutionized modeling; however, of course they aren\u2019t infallible. 
To help users understand what\u2019s likely correct and wrong in a structural model, AlphaFold returns not just the atomic coordinates that constitute the structural model but also quality metrics, of which the most important ones are the residue-wise pLDDTs and the inter-residue PAEs. We here tested\u00a0<a href=\"https:\/\/alphafold.ebi.ac.uk\/entry\/Q12XU1\" target=\"_blank\" rel=\"noreferrer noopener\">GPT-4o with vision capabilities on this Uniprot entry precomputed at the AF-EBI database<\/a>:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"474\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_hjMav3xo095K7YU99G9tKg-1024x474.webp?resize=1024%2C474&#038;ssl=1\" alt=\"\" class=\"wp-image-600109\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"5fcd\">The text part of the prompt was:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"6585\">You are an expert in bioinformatics. I\u2019m checking an entry at the AF-EBI database of protein structure models, and I got what I\u2019m showing in the attached screenshot. What do you think? Is the model good? 
I\u2019m especially interested in the relative orientation of the two domains, but I\u2019m not sure what to conclude from the PAE plot\u2026<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"2526\">And this was the answer:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"ae7a\">From your screenshot, we can analyze the quality of the model based on the\u00a0<strong>pLDDT confidence scores<\/strong>\u00a0and the\u00a0<strong>Predicted Aligned Error (PAE) plot<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"870f\"><strong>Model Confidence (pLDDT)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"d6e6\">The\u00a0<strong>Mol*<\/strong>\u00a03D viewer colors the model based on confidence scores.<\/p>\n<p class=\"wp-block-paragraph\" id=\"4997\">Most of the structure is colored in\u00a0<strong>blue (high to very high confidence, pLDDT &gt; 70\u201390)<\/strong>, suggesting that the individual domains are well-predicted.<\/p>\n<p class=\"wp-block-paragraph\" id=\"9b3a\">However, some\u00a0<strong>loops and termini (orange\/red, pLDDT &lt; 50\u201370)<\/strong>\u00a0have lower confidence, which is expected for flexible regions.<\/p>\n<p class=\"wp-block-paragraph\" id=\"7d66\"><strong>PAE Plot Analysis (Domain Orientation)<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"0afe\">The PAE plot (right panel) indicates how well the relative positions of different residues are predicted.<\/p>\n<p class=\"wp-block-paragraph\" id=\"1a68\">A\u00a0<strong>low PAE (dark green)<\/strong>\u00a0means the relative position between two residues is well-defined.<\/p>\n<p class=\"wp-block-paragraph\" id=\"aa29\">A\u00a0<strong>high PAE (lighter green\/white areas)<\/strong>\u00a0suggests uncertainty in the relative positioning.<\/p>\n<p class=\"wp-block-paragraph\" id=\"2993\"><strong>Interpreting the PAE<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"d109\">The diagonal 
(self-alignment) is always well-defined.<\/p>\n<p class=\"wp-block-paragraph\" id=\"14c3\">The block-like structure in the PAE plot suggests\u00a0<strong>two or more well-defined domains<\/strong>\u00a0with high internal confidence.<\/p>\n<p class=\"wp-block-paragraph\" id=\"51ed\">However,\u00a0<strong>lighter green areas between the blocks indicate uncertainty in the relative orientation of these domains<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"0ad7\">If your interest is in the relative orientation of the domains, this suggests\u00a0<strong>the model may not be reliable in that aspect<\/strong>\u00a0\u2014 it might have multiple plausible conformations.<\/p>\n<p class=\"wp-block-paragraph\" id=\"23da\"><strong>Conclusion<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"ec89\">The\u00a0<strong>individual domains are well-modeled<\/strong>, but their\u00a0<strong>relative orientation might be uncertain<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6cca\">If functionally relevant, you might consider\u00a0<strong>experimental validation (e.g., cryo-EM, SAXS, or NMR)<\/strong>\u00a0or\u00a0<strong>molecular dynamics simulations<\/strong>\u00a0to explore domain flexibility.<\/p>\n<p class=\"wp-block-paragraph\" id=\"6b2e\">If you need a more rigid structure, you could look for\u00a0<strong>homologous experimental structures<\/strong>\u00a0or use\u00a0<strong>AlphaFold-Multimer<\/strong>\u00a0if interactions between chains matter.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"db1f\">The answer was pretty neat and correct. However, we realized that we might have confused the model by talking about two domains, when the screenshot actually shows three. 
We then asked a follow-up question:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"cd8e\">Now checking carefully I see the TED Domains thing shows 3 domains\u2026 am I right that according to the PAE plot the arrangement of the first two domains is confident, but not so much their arrangement relative to the third one?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"e7fd\">The answer, based on an inspection of the PAE plot considering now that there are indeed 3 domains, the first two being confidently modeled relative to each other and the third one only modeled confidently locally, is again perfect:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"7177\">Yes, your interpretation is correct! The\u00a0<strong>TED Domains annotation<\/strong>\u00a0indicates that the model has\u00a0<strong>three domains<\/strong>, as shown by the color-coded segments at the top of the PAE plot.<\/p>\n<p class=\"wp-block-paragraph\" id=\"2ed9\">Looking at the\u00a0<strong>PAE plot<\/strong>:<\/p>\n<p class=\"wp-block-paragraph\" id=\"44b9\">The\u00a0<strong>first two domains<\/strong>\u00a0(green and orange segments) have\u00a0<strong>low PAE values between them<\/strong>\u00a0(darker green in the top-left block of the matrix). This suggests that their\u00a0<strong>relative orientation is well-defined and confident<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"4a0e\">However, the\u00a0<strong>third domain<\/strong>\u00a0(purple segment) shows\u00a0<strong>higher PAE values relative to the first two<\/strong>\u00a0(lighter green\/white regions in the lower-right part of the matrix). 
This means that the\u00a0<strong>relative position of the third domain is uncertain<\/strong>\u00a0\u2014 it may adopt multiple conformations.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d953\"><strong>What This Means for Your Analysis<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"5df4\">The\u00a0<strong>first two domains likely form a rigid unit<\/strong>\u00a0with a stable interaction.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ed2f\">The\u00a0<strong>third domain may be more flexible<\/strong>\u00a0relative to the first two, possibly moving independently or being less constrained in solution.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ad22\">If this flexibility is functionally important (e.g., for ligand binding, protein-protein interactions, or conformational switching), experimental validation like\u00a0<strong>SAXS, cryo-EM, or FRET<\/strong>\u00a0could be useful.<\/p>\n<\/blockquote>\n<h3 class=\"wp-block-heading\" id=\"85f3\">Visual analysis of protein-ligand complexes<\/h3>\n<p class=\"wp-block-paragraph\" id=\"beb9\">Last, I geared up to challenge GPT-4o with a more difficult problem. Given a 2D snapshot from a molecular graphics program showing a ligand bound to a protein, can it infer what kinds of contacts the two molecules establish?<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_6JfW6GnAg-a0HhqEuMDXpA.webp?ssl=1\" alt=\"\" class=\"wp-image-600110\"><\/figure>\n<p class=\"wp-block-paragraph\" id=\"8a71\">The question was:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"2251\">Here\u2019s a screenshot of ibuprofen docked inside a pocket in BSA. What interactions do you see between the ligand and the protein? 
I want you to tell me exactly what interactions you see in this structure from the screenshot provided.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"46e3\">And this was the answer:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"13a1\">From the screenshot you provided, I can visually infer potential interactions, but without residue labels or a precise atomic distance measurement, I can only make educated observations. Here\u2019s what I\u00a0<strong>specifically see<\/strong>\u00a0in your image:<\/p>\n<p class=\"wp-block-paragraph\" id=\"4e64\"><strong>1. Hydrophobic Interactions:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"4abd\">Ibuprofen (highlighted in yellow) is embedded in a\u00a0<strong>hydrophobic pocket<\/strong>\u00a0of the protein.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a1d5\">It appears to be surrounded by\u00a0<strong>gray carbon atoms<\/strong>\u00a0from amino acid side chains, likely hydrophobic residues (e.g., leucine, valine, phenylalanine, or isoleucine).<\/p>\n<p class=\"wp-block-paragraph\" id=\"fc6b\">The\u00a0<strong>benzene ring of ibuprofen<\/strong>\u00a0seems to be in close contact with other\u00a0<strong>aromatic or aliphatic side chains<\/strong>, suggesting\u00a0<strong>\u03c0-\u03c0 stacking or van der Waals interactions<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\" id=\"8757\"><strong>2. 
Hydrogen Bonds:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"3166\">The\u00a0<strong>carboxyl (-COOH) group of ibuprofen<\/strong>\u00a0appears to be interacting with nearby\u00a0<strong>blue (nitrogen) or red (oxygen) atoms<\/strong>\u00a0from the protein.<\/p>\n<p class=\"wp-block-paragraph\" id=\"f5b2\">There is\u00a0<strong>at least one close contact between the oxygen of the carboxyl group and a protein residue, likely forming a hydrogen bond<\/strong>\u00a0with a backbone amide or side chain hydroxyl (e.g., Ser, Thr, Tyr, Asp, or Glu).<\/p>\n<p class=\"wp-block-paragraph\" id=\"3fc4\"><strong>3. Possible Electrostatic Interactions:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"8160\">Ibuprofen is negatively charged at physiological pH (<strong>carboxylate anion, -COO\u207b<\/strong>).<\/p>\n<p class=\"wp-block-paragraph\" id=\"c0d9\">There may be\u00a0<strong>a salt bridge<\/strong>\u00a0or\u00a0<strong>ion-dipole interaction<\/strong>\u00a0if a\u00a0<strong>positively charged residue (e.g., Arg, Lys, or His) is nearby<\/strong>, but I cannot confirm the exact residue from the image alone.<\/p>\n<p class=\"wp-block-paragraph\" id=\"9827\"><strong>What I Cannot Determine from the Screenshot Alone:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"5a91\"><strong>Precise interacting residues:<\/strong>\u00a0No residue labels are visible.<\/p>\n<p class=\"wp-block-paragraph\" id=\"df2e\"><strong>Exact interaction distances:<\/strong>\u00a0Hydrogen bonds and electrostatic interactions typically require precise atomic distances (~2\u20133 \u00c5).<\/p>\n<p class=\"wp-block-paragraph\" id=\"1e2b\"><strong>Interaction strengths:<\/strong>\u00a0Molecular modeling software would be needed to confirm interaction energies.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"d3d8\">The answer was quite good, especially considering that the picture is just a static 2D view and is quite crowded with atoms. 
In fact, it correctly identifies that the ibuprofen is shown in yellow, and highlights hydrophobic and aromatic contacts around its aromatic ring, both of which do occur (#1 in the figure below), and hydrogen bonds from its carboxylate group to nearby N and O atoms (#2 in the figure), adding that there might also be a salt bridge between that same carboxylate and some positively charged group of the protein that it can\u2019t distinguish clearly (#3).<\/p>\n<p class=\"wp-block-paragraph\" id=\"80a1\">For your reference, I show these interactions in the next figure, numbered 1, 2 and 3 as they were mentioned:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/1_jhkx-RsffPPtE3k3nVsB0Q.webp?ssl=1\" alt=\"\" class=\"wp-image-600111\"><\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h2 class=\"wp-block-heading\" id=\"250a\">Other Vision-Capable Models? Peeking Into Google\u2019s Gemini 2.0 Flash<\/h2>\n<p class=\"wp-block-paragraph\" id=\"4046\">OpenAI is probably the most famous developer of LLM-based AI systems, but of course not the only one, and its models aren\u2019t even the best for some goals, according to some evaluations. However, they remain, to me at least, the easiest to use both directly in a GUI (ChatGPT) and programmatically (as we have seen here and I\u2019ve shown in many other posts).<\/p>\n<p class=\"wp-block-paragraph\" id=\"d884\">But my point here is that we can\u2019t overlook the other models. 
While preparing this post, I decided to try Gemini 2.0 Flash in Google\u2019s AI Studio, which like GPT-4o has picture, audio and document processing capabilities, and even video!<\/p>\n<p class=\"wp-block-paragraph\" id=\"674f\">Again I was amazed, and in fact it seemed to perform slightly better than the GPT-4o and GPT-4o-mini models tried throughout this post. I was particularly stunned by this specific example, where I loaded the screenshot of my app shown above, to which Gemini replied (without any prompt):<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"afc6\">Here\u2019s what I can infer based on the image you provided:<\/p>\n<p class=\"wp-block-paragraph\" id=\"53d9\"><strong>Prompt:<\/strong>\u00a0\u201cWhat do you think the artist\u201d<\/p>\n<p class=\"wp-block-paragraph\" id=\"3b93\"><strong>Analysis:<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"523b\">The response provided at the bottom of the image states \u201cThe artist likely intended to convey themes of nature, beauty, and vibrancy through the striking imagery of the red poppies against a lush green background.\u201d<\/p>\n<p class=\"wp-block-paragraph\" id=\"4cc7\">Essentially, the AI is recognizing the subject matter and interpreting the artist\u2019s intent as showcasing the beauty and vibrancy of nature.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"d5b0\">That was quite similar to what GPT-4o replied, probably because it just read what GPT-4o itself wrote. 
But what if I ask it a question trying to address the same point I asked GPT-4o?<\/p>\n<p class=\"wp-block-paragraph\" id=\"697e\">Here it goes:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"1597\">do your own interpretation of the picture\u2026 what do you think did the author intend to communicate?<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"dd7d\">And this was Gemini\u2019s reply, now perfectly reflecting what I think the author intended to convey:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\" id=\"37b6\">Okay, based on the image, here\u2019s my interpretation of what the author might have intended:<\/p>\n<p class=\"wp-block-paragraph\" id=\"9e44\">Beyond the surface-level beauty, the image hints at a more complex exploration of color harmony and contrast within nature.<\/p>\n<p class=\"wp-block-paragraph\" id=\"d156\"><strong>The juxtaposition of the natural poppy field with the color wheel suggests a deliberate artistic statement.<\/strong>\u00a0The artist may be inviting us to analyze the colors present in the natural scene (dominated by greens and reds) and to understand how these colors interact and create a specific visual experience.<\/p>\n<p class=\"wp-block-paragraph\" id=\"ca0d\"><strong>The placement of the color wheel with the arrows pointing to the green and red suggest complementary colours.<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"3744\"><strong>Perhaps the author wants us to appreciate the inherent color palettes nature provides.<\/strong><\/p>\n<p class=\"wp-block-paragraph\" id=\"9a43\">In short, while the surface reading is about beauty, the integration of the color wheel suggests a deeper interest in the science and theory of color as applied to the natural world. 
The author seems to be inviting a more analytical and informed way of appreciating the beauty we see.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\" id=\"59ef\">The key part of Gemini\u2019s reply, which GPT-4o didn\u2019t get in any of my tests, is this:<\/p>\n<p class=\"wp-block-paragraph\" id=\"0717\">\u201cThe placement of the color wheel with the arrows pointing to the green and red suggest complementary colours.\u201d<\/p>\n<p class=\"wp-block-paragraph\" id=\"d8a3\">That was exactly my point as an artist!<\/p>\n<p class=\"wp-block-paragraph\" id=\"a609\">If this post is well received, I will test Gemini 2.0\u2019s various capabilities in more detail. So let me know in the comments what you think!<\/p>\n<h2 class=\"wp-block-heading\" id=\"29eb\">Conclusions<\/h2>\n<p class=\"wp-block-paragraph\" id=\"b211\">I guess that after having read all these examples, you now also admire the potential of multimodal, vision-capable AI systems. The tests I presented with the GPT-4o models (and the glimpse into Gemini 2.0) show that the most modern AI systems can interpret quite complex images, understanding and extracting information from photographs, scientific figures of various kinds, plots and charts, even when they contain information as complex as molecular structures. From these examples we can envision applications as varied as assisting data analysis, assisting driving, and controlling robots, all of them provided that some problems are overcome.<\/p>\n<p class=\"wp-block-paragraph\" id=\"18da\">I say this because some images and questions are quite challenging, of course. 
In some cases, for example when limited by the resolution of labels in dense plots, by the lack of 3D perspective in flat screenshots of molecular structures, or by having to estimate rotation angles for the robotic arm, there\u2019s probably not much to do, and the model remains as limited as even the most expert humans in the subject would be.<\/p>\n<p class=\"wp-block-paragraph\" id=\"10df\">Even so, it is overall clear that with proper prompting and continued advancements in AI, these models could become invaluable tools for accelerating data interpretation and decision-making, reducing the load on human experts, who can then dedicate themselves to more complex problems while the software assists non-experts in interpreting graphical outputs from software, and who knows, maybe some day drive cars and control robots!<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<p class=\"wp-block-paragraph\" id=\"5793\"><a href=\"https:\/\/www.lucianoabriata.com\/\" rel=\"noreferrer noopener\" target=\"_blank\"><em>www.lucianoabriata.com<\/em><\/a><em>\u00a0I write about everything that lies in my broad sphere of interests: nature, science, technology, programming, etc.\u00a0<\/em><a href=\"https:\/\/lucianosphere.medium.com\/subscribe\"><em>Subscribe to get my new stories<\/em><\/a><em>\u00a0by email. To consult about small jobs check my\u00a0<\/em><a href=\"https:\/\/lucianoabriata.altervista.org\/services\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>services page here<\/em><\/a><em>. You can\u00a0<\/em><a href=\"https:\/\/lucianoabriata.altervista.org\/office\/contact.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>contact me here<\/em><\/a><em>. 
You can\u00a0<\/em><a href=\"https:\/\/paypal.me\/LAbriata\" rel=\"noreferrer noopener\" target=\"_blank\"><em>tip me here<\/em><\/a><em>.<\/em><\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/testing-the-power-of-multimodal-ai-systems-in-reading-and-interpreting-photographs-maps-charts-and-more\/\">Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and\u00a0More<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Luciano Abriata<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/testing-the-power-of-multimodal-ai-systems-in-reading-and-interpreting-photographs-maps-charts-and-more\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and\u00a0More Introduction It\u2019s no news that artificial intelligence has made huge strides in recent years, particularly with the advent of multimodal models that can process and create both text and images, and some very new ones that also process and produce 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2117,62,221,67,71,70,2119],"tags":[1163,73,1907],"class_list":["post-2636","post","type-post","status-publish","format-standard","hentry","category-ai-systems","category-aimldsaimlds","category-computer-vision","category-deep-dives","category-large-language-models","category-machine-learning","category-multimodal-learning","tag-gpt","tag-models","tag-o"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2636"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=2636"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2636\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=2636"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=2636"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=2636"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}