{"id":448,"date":"2024-05-31T03:21:11","date_gmt":"2024-05-30T21:51:11","guid":{"rendered":"https:\/\/guganeshan.com\/blog\/?p=448"},"modified":"2024-05-31T02:33:47","modified_gmt":"2024-05-30T21:03:47","slug":"i-found-a-simple-task-where-generative-ai-fails","status":"publish","type":"post","link":"https:\/\/guganeshan.com\/blog\/i-found-a-simple-task-where-generative-ai-fails.html","title":{"rendered":"I found a simple task where Generative AI fails!"},"content":{"rendered":"\n<p>I made <a href=\"https:\/\/openai.com\/index\/hello-gpt-4o\/\" target=\"_blank\" rel=\"noopener\" title=\"\">ChatGPT-4o<\/a> go into a loop while trying to come up with an answer to my question yesterday. <a href=\"https:\/\/blog.google\/products\/gemini\/bard-gemini-advanced-app\/\" target=\"_blank\" rel=\"noopener\" title=\"\">Gemini Advanced<\/a> failed too. It was surprising to me, to think that these very capable Generative AI applications can fail in a task that is relatively simple to us humans.<\/p>\n\n\n\n<p>But today, it looks like ChatGPT learned from its mistakes \ud83d\udc4f<\/p>\n\n\n\n<p>If you also want to try this, here are the details \ud83d\udc47 <\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Background (Baby names! \ud83d\udc76)<\/h5>\n\n\n\n<p>A friend wanted help with finding a good baby name. My search started simply in Google, as usual. 
Then, I came across an article about numerology and how each letter of the alphabet is assigned a specific number, which lets us calculate a total for a baby name and check whether it is compatible (the details are irrelevant for this blog post).<\/p>\n\n\n\n<p>These were the numbers for each letter:<\/p>\n\n\n\n<figure class=\"wp-block-image size-medium\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/image.png\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"219\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/image-300x219.png\" alt=\"Values for each character of the alphabet in Chaldean numerology\" class=\"wp-image-450\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/image-300x219.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/image.png 581w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/figure>\n\n\n\n<p>So, for example, if the name is &#8220;John&#8221;, it would be:<\/p>\n\n\n\n<p>J = 1, O = 7, H = 5, N = 5.<\/p>\n\n\n\n<p>So, 1 + 7 + 5 + 5 = 18<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">My idea \ud83d\udca1<\/h5>\n\n\n\n<p>I wanted to use AI to come up with baby names whose total (according to the above chart) is a specific number, just to see how easy it would be. 
After all, these Generative AI applications can code fully functional games before you can even finish your coffee.<\/p>\n\n\n\n<p>If a human were given this task, we would:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Collect a few names we like<\/li>\n\n\n\n<li>Calculate the total for each name to see if it matches the required total<\/li>\n\n\n\n<li>If it matches, add it to a &#8216;selected names&#8217; list<\/li>\n\n\n\n<li>If not, try to modify the name a bit to conform to the total<\/li>\n\n\n\n<li>If that fails, move on to other names<\/li>\n\n\n\n<li>Repeat until we have the required number of names<\/li>\n<\/ul>\n\n\n\n<p>It&#8217;s as simple as that.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">First step \u2705<\/h5>\n\n\n\n<p>As a first step, I gave both ChatGPT and Gemini the characters &amp; values in the chart above and explained the situation.<\/p>\n\n\n\n<p>Then, to make sure they got it, I gave a name and asked for the total. Here&#8217;s the Gemini screenshot:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"842\" height=\"752\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/1.png\" alt=\"Gemini Advanced AI chat\" class=\"wp-image-451\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/1.png 842w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/1-300x268.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/1-768x686.png 768w\" sizes=\"auto, (max-width: 842px) 100vw, 842px\" \/><\/a><\/figure>\n\n\n\n<p>As expected, that was easy-peasy. 
ChatGPT responded similarly.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Step Two \ud83d\udc4e<\/h5>\n\n\n\n<p>Then I pulled the UNO Reverse card on them and asked them to come up with names that add up to a particular total (I asked ChatGPT for 5 names and Gemini for 1). They both started lying!<\/p>\n\n\n\n<p>Here&#8217;s the Gemini screenshot:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/2.png\"><img loading=\"lazy\" decoding=\"async\" width=\"847\" height=\"585\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/2.png\" alt=\"Generative AI chat\" class=\"wp-image-452\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/2.png 847w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/2-300x207.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/2-768x530.png 768w\" sizes=\"auto, (max-width: 847px) 100vw, 847px\" \/><\/a><\/figure>\n\n\n\n<p>When questioned, they both responded with a &#8220;sorry&#8221;:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/3.png\"><img loading=\"lazy\" decoding=\"async\" width=\"847\" height=\"372\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/3.png\" alt=\"Gemini Advanced AI chat\" class=\"wp-image-454\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/3.png 847w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/3-300x132.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/3-768x337.png 768w\" sizes=\"auto, (max-width: 847px) 100vw, 847px\" \/><\/a><\/figure>\n\n\n\n<p>I tried to make it &#8220;really&#8221; easy, expecting the word &#8220;Dog&#8221;, but it guessed &#8220;Kitty&#8221; (I love both, so that&#8217;s forgivable), but how can you forgive that addition?<\/p>\n\n\n\n<figure class=\"wp-block-image 
size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/4.png\"><img loading=\"lazy\" decoding=\"async\" width=\"842\" height=\"552\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/4.png\" alt=\"Gemini Advanced AI chat\" class=\"wp-image-455\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/4.png 842w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/4-300x197.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/4-768x503.png 768w\" sizes=\"auto, (max-width: 842px) 100vw, 842px\" \/><\/a><\/figure>\n\n\n\n<p>They kept saying sorry&#8230;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/5.png\"><img loading=\"lazy\" decoding=\"async\" width=\"843\" height=\"402\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/5.png\" alt=\"Gemini Advanced AI chat\" class=\"wp-image-456\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/5.png 843w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/5-300x143.png 300w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/5-768x366.png 768w\" sizes=\"auto, (max-width: 843px) 100vw, 843px\" \/><\/a><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">The scary ChatGPT loop \u27bf<\/h5>\n\n\n\n<p>Since I asked Gemini for only 1 name, it just failed, said sorry and stopped &#8211; no hard feelings.<\/p>\n\n\n\n<p>With ChatGPT, since I asked for 5 names, it was listing each name along with its characters and values. So I asked it to stop doing that and list just the name and its total. 
Also, I told it &#8220;strongly&#8221; that it should respond only when it had names with the correct totals, because the wrong answers were getting annoying.<\/p>\n\n\n\n<div class=\"wp-block-group is-nowrap is-layout-flex wp-container-core-group-is-layout-ad2f72ca wp-block-group-is-layout-flex\">\n<p>I guess I hurt ChatGPT&#8217;s feelings! It started going crazy \ud83d\ude15<\/p>\n<\/div>\n\n\n\n<p>ChatGPT would list a name, its characters, values and the total &#8211; then it would realize that the total was not what I wanted, say sorry and try again&#8230; and again&#8230; and again&#8230; <\/p>\n\n\n\n<p>Here&#8217;s how the loop went:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"616\" height=\"653\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-1.png\" alt=\"ChatGPT-4o AI chat\" class=\"wp-image-457\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-1.png 616w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-1-283x300.png 283w\" sizes=\"auto, (max-width: 616px) 100vw, 616px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-2.png\"><img loading=\"lazy\" decoding=\"async\" width=\"597\" height=\"668\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-2.png\" alt=\"ChatGPT-4o AI chat\" class=\"wp-image-458\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-2.png 597w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-2-268x300.png 268w\" sizes=\"auto, (max-width: 597px) 100vw, 597px\" \/><\/a><\/figure>\n\n\n\n<p>Around the 14th attempt, I got scared and stopped the 
generation!<\/p>\n\n\n\n<p>Then, I tried <em>one more time<\/em> to clearly state what I wanted. The result? More lies:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-3.png\"><img loading=\"lazy\" decoding=\"async\" width=\"735\" height=\"871\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-3.png\" alt=\"ChatGPT-4o AI chat\" class=\"wp-image-459\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-3.png 735w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-yesterday-3-253x300.png 253w\" sizes=\"auto, (max-width: 735px) 100vw, 735px\" \/><\/a><\/figure>\n\n\n\n<p>I gave up and went to bed happy, because they are not going to take our developer jobs this way \ud83d\ude1c<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">The comeback \ud83d\udc4f<\/h5>\n\n\n\n<p>I wanted to recreate the situation in a new chat today, to capture a video of ChatGPT failing and going into a loop, for this blog post. But to my surprise, it looks like it has learned from its mistakes. This was the response today:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-today-2-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"736\" height=\"668\" src=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-today-2-1.png\" alt=\"ChatGPT-4o AI chat\" class=\"wp-image-463\" srcset=\"https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-today-2-1.png 736w, https:\/\/guganeshan.com\/blog\/wp-content\/uploads\/2024\/05\/chatgpt-today-2-1-300x272.png 300w\" sizes=\"auto, (max-width: 736px) 100vw, 736px\" \/><\/a><\/figure>\n\n\n\n<p>It says &#8220;out of this list&#8221; and &#8220;from the list&#8221; in this answer (I had not mentioned anything about a list). 
My guess is that today it first collected names into a list and calculated the totals before responding (like we would), whereas yesterday it responded immediately without calculating the totals properly.<\/p>\n\n\n\n<p>Gemini still doesn&#8217;t care.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Have you experienced anything interesting like this with your Generative AI tools?<\/h5>\n\n\n\n<p>If you have had any interesting experience like this with AI (or even a different result for the above experiment), feel free to share it in the comments below &#8211; I&#8217;d love to hear about it.<\/p>\n\n\n\n<p><strong>Cheers<\/strong> \ud83c\udf7b<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I made ChatGPT-4o go into a loop while trying to come up with an answer to my question yesterday. Gemini Advanced failed too. It was surprising to me, to think that these very capable Generative AI applications can fail in a task that is relatively simple to us humans. 
But today, it looks like ChatGPT [&#8230;]<!-- AddThis Advanced Settings generic via filter on get_the_excerpt --><!-- AddThis Share Buttons generic via filter on get_the_excerpt --><\/p>\n","protected":false},"author":1,"featured_media":498,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[78,79,83,80,82,81],"class_list":["post-448","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","tag-ai","tag-chatgpt","tag-chatgpt-4o","tag-gemini","tag-geminiadvanced","tag-generativeai"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/posts\/448","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/comments?post=448"}],"version-history":[{"count":18,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/posts\/448\/revisions"}],"predecessor-version":[{"id":473,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/posts\/448\/revisions\/473"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/media\/498"}],"wp:attachment":[{"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/media?parent=448"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/categories?post=448"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/guganeshan.com\/blog\/wp-json\/wp\/v2\/tags?post=448"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}