{"id":2184,"date":"2025-03-04T07:02:28","date_gmt":"2025-03-04T07:02:28","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/03\/04\/avoidable-and-unavoidable-randomness-in-gpt-4o\/"},"modified":"2025-03-04T07:02:28","modified_gmt":"2025-03-04T07:02:28","slug":"avoidable-and-unavoidable-randomness-in-gpt-4o","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/03\/04\/avoidable-and-unavoidable-randomness-in-gpt-4o\/","title":{"rendered":"Avoidable and Unavoidable Randomness in GPT-4o"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\">Of course there is randomness in GPT-4o\u2019s outputs. After all, the model samples from a probability distribution when choosing each token. But what I didn\u2019t understand was that those very probabilities themselves are not deterministic. Even with consistent prompts, fixed seeds, and temperature set to zero, GPT-4o still introduces subtle, frustrating randomness.<\/p>\n<p class=\"wp-block-paragraph\">There\u2019s no fix for this, and it might not even be something OpenAI <em>could<\/em> fix if they wanted to, just so we\u2019re clear up front about where this article is headed. Along the way, we\u2019ll examine all the sources of randomness in GPT-4o output, which will require us to break down the sampling process to a low level. We\u2019ll point at the issue\u2014the probabilities vary\u2014and critically examine OpenAI\u2019s official guidance on determinism.<\/p>\n<p class=\"wp-block-paragraph\">First, though, let\u2019s talk about why determinism matters. Determinism means that the same input always produces the same output, like a mathematical function. 
While LLM creativity is often desirable, determinism serves crucial purposes: researchers need it for reproducible experiments, developers for verifying reported results, and prompt engineers for debugging their changes. Without it, you\u2019re left wondering whether different outputs stem from your tweaks or just the random number generator\u2019s mood swings.<\/p>\n<h2 class=\"wp-block-heading\">Flipping a coin<\/h2>\n<p class=\"wp-block-paragraph\">We\u2019re going to keep things extremely simple here and prompt the most recent version of GPT-4o (<code>gpt-4o-2024-08-06<\/code> in the API) with this:<\/p>\n<p class=\"wp-block-paragraph\"><code>Flip a coin. Return Heads or Tails only.<\/code><\/p>\n<p class=\"wp-block-paragraph\">Flipping a coin with LLMs is a fascinating topic in itself (see for example Van Koevering &amp; Kleinberg, 2024 in the references), but here, we\u2019ll use it as a simple binary question with which to explore determinism, or the lack thereof.<\/p>\n<p class=\"wp-block-paragraph\">This is our first attempt.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import os\nfrom openai import OpenAI\nclient = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))\n\nprompt = 'Flip a coin. Return Heads or Tails only.'\n\nresponse = client.chat.completions.create(\n    model='gpt-4o-2024-08-06',\n    messages=[{'role': 'user', 'content': prompt}],\n)\n\nprint(response.choices[0].message.content)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Running the code gave me <code>Heads<\/code>. Maybe you\u2019ll get <code>Tails<\/code>, or if you\u2019re really lucky, something far more interesting.<\/p>\n<p class=\"wp-block-paragraph\">The code first initializes an OpenAI client with an API key set in the environment variable <code>OPENAI_API_KEY<\/code> (to avoid sharing billing credentials here). 
The main action happens with <code>client.chat.completions.create<\/code>, where we specify the model to use and send the prompt (as part of a very simple conversation named <code>messages<\/code>) to the server. We get an object called <code>response<\/code> back from the server. This object contains a lot of information, as shown below, so we need to dig into it to extract GPT-4o\u2019s actual response to the message, which is <code>response.choices[0].message.content<\/code>.<\/p>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; response\nChatCompletion(id='chatcmpl-B48EqZBLfUWtp9H7cwnchGTJbBDwr', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Heads', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1740324680, model='gpt-4o-2024-08-06', object='chat.completion', service_tier='default', system_fingerprint='fp_eb9dce56a8', usage=CompletionUsage(completion_tokens=2, prompt_tokens=18, total_tokens=20, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Now let\u2019s flip the coin ten times. If this were a real, fair coin, of course, we would expect roughly equal heads and tails over time thanks to the law of large numbers. But GPT-4o\u2019s coin doesn\u2019t work quite like that.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import os\nfrom openai import OpenAI\nclient = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))\n\nprompt = 'Flip a coin. Return Heads or Tails only.'\n\nfor _ in range(10):\n    response = client.chat.completions.create(\n        model='gpt-4o-2024-08-06',\n        messages=[{'role': 'user', 'content': prompt}],\n    )\n    print(response.choices[0].message.content)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Running this code gave me the following output, although you might get different output, of course.<\/p>\n<pre class=\"wp-block-code\"><code>Heads\nHeads\nHeads\nHeads\nHeads\nHeads\nTails\nHeads\nHeads\nHeads<\/code><\/pre>\n<p class=\"wp-block-paragraph\">GPT-4o\u2019s coin is clearly biased, but so are humans. Bar-Hillel, Peer, and Acquisti (2014) found that people flipping imaginary coins choose \u201cheads\u201d 80% of the time. Maybe GPT-4o learned that from us. But whatever the reason, we\u2019re just using this simple example to explore determinism.<\/p>\n<h2 class=\"wp-block-heading\">Just how biased is GPT-4o\u2019s coin?<\/h2>\n<p class=\"wp-block-paragraph\">Let\u2019s say we wanted to know precisely what percentage of GPT-4o coin flips land Heads.<\/p>\n<p class=\"wp-block-paragraph\">Rather than the obvious (but expensive) approach of flipping it a million times, there\u2019s a smarter way. For classification tasks with a small set of possible answers, we can extract token probabilities instead of generating full responses. With the right prompt, the first token carries all the necessary information, making these API calls incredibly cheap: around 30,000 calls per dollar, since each requires just 18 (cached) input tokens and 1 output token.<\/p>\n<p class=\"wp-block-paragraph\">OpenAI gives us (natural) log probabilities. These are called <code>logprobs<\/code> in the code, and we convert them to regular probabilities by exponentiation. 
(We\u2019ll discuss temperature soon, but note that exponentiating logprobs directly like this corresponds to a temperature setting of 1.0, and is how we calculate probabilities throughout this article.) OpenAI lets us request logprobs for the top 20 most likely tokens, so we do that.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import os\nimport math\nfrom openai import OpenAI\nfrom tabulate import tabulate\n\nclient = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))\n\nprompt = 'Flip a coin. Return Heads or Tails only.'\n\nresponse = client.chat.completions.create(\n    model='gpt-4o-2024-08-06',\n    max_tokens=1,\n    logprobs=True,\n    top_logprobs=20,\n    messages=[{'role': 'user', 'content': prompt}],\n)\n\nlogprobs_list = response.choices[0].logprobs.content[0].top_logprobs\n\ndata = []\ntotal_pct = 0.0\n\nfor logprob_entry in logprobs_list:\n    token = logprob_entry.token\n    logprob = logprob_entry.logprob\n    pct = math.exp(logprob) * 100  # Convert logprob to a percentage\n    total_pct += pct\n    data.append([token, logprob, pct])\n\nprint(\n    tabulate(\n        data,\n        headers=[\"Token\", \"Log Probability\", \"Percentage (%)\"],\n        tablefmt=\"github\",\n        floatfmt=(\"s\", \".10f\", \".10f\")\n    )\n)\nprint(f\"\\nTotal probabilities: {total_pct:.6f}%\")<\/code><\/pre>\n<p class=\"wp-block-paragraph\">If you run this, you\u2019ll get <em>something<\/em> like the following output, but actual numbers <em>will<\/em> vary.<\/p>\n<pre class=\"wp-block-code\"><code>| Token \u00a0 \u00a0 | \u00a0 Log Probability | \u00a0 Percentage (%) |\n|-----------|-------------------|------------------|\n| Heads \u00a0 \u00a0 | \u00a0 \u00a0 -0.0380541235 |\u00a0 \u00a0 96.2660836887 |\n| T \u00a0 \u00a0 \u00a0 
\u00a0 | \u00a0 \u00a0 -3.2880542278 | \u00a0 \u00a0 3.7326407467 |\n| Sure\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -12.5380544662 | \u00a0 \u00a0 0.0003587502 |\n| Head\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -12.7880544662 | \u00a0 \u00a0 0.0002793949 |\n| Tail\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -13.2880544662 | \u00a0 \u00a0 0.0001694616 |\n| Certainly |\u00a0 \u00a0 -13.5380544662 | \u00a0 \u00a0 0.0001319768 |\n| \"T\u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -14.2880544662 | \u00a0 \u00a0 0.0000623414 |\n| I'm \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -14.5380544662 | \u00a0 \u00a0 0.0000485516 |\n| heads \u00a0 \u00a0 |\u00a0 \u00a0 -14.5380544662 | \u00a0 \u00a0 0.0000485516 |\n| Heads \u00a0 \u00a0 |\u00a0 \u00a0 -14.9130544662 | \u00a0 \u00a0 0.0000333690 |\n| \" \u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -15.1630544662 | \u00a0 \u00a0 0.0000259878 |\n| _heads\u00a0 \u00a0 |\u00a0 \u00a0 -15.1630544662 | \u00a0 \u00a0 0.0000259878 |\n| tails \u00a0 \u00a0 |\u00a0 \u00a0 -15.5380544662 | \u00a0 \u00a0 0.0000178611 |\n| HEAD\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -15.7880544662 | \u00a0 \u00a0 0.0000139103 |\n| TAIL\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.2880535126 | \u00a0 \u00a0 0.0000084370 |\n| T \u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.7880535126 | \u00a0 \u00a0 0.0000051173 |\n| ``` \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.7880535126 | \u00a0 \u00a0 0.0000051173 |\n| Here's\u00a0 \u00a0 |\u00a0 \u00a0 -16.9130535126 | \u00a0 \u00a0 0.0000045160 |\n| I \u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -17.2880535126 | \u00a0 \u00a0 0.0000031038 |\n| As\u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -17.2880535126 | \u00a0 \u00a0 0.0000031038 |\n\nTotal probabilities: 99.999970%<\/code><\/pre>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><\/blockquote>\n<p class=\"wp-block-paragraph\">Looking at these probabilities, we see Heads at \u224896% and T at \u22484%. Our prompt is doing pretty well at constraining the model\u2019s responses. 
Why <code>T<\/code> and not <code>Tails<\/code>? This is the tokenizer splitting <code>Tails<\/code> into <code>T<\/code> + <code>ails<\/code>, while keeping <code>Heads<\/code> as one piece, as we can see in this Python session:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">&gt;&gt;&gt; import tiktoken\n&gt;&gt;&gt; encoding = tiktoken.encoding_for_model(\"gpt-4o-2024-08-06\")\n&gt;&gt;&gt; encoding.encode('Tails')\n[51, 2196]\n&gt;&gt;&gt; encoding.decode([51])\n'T'\n&gt;&gt;&gt; encoding.encode('Heads')\n[181043]<\/code><\/pre>\n<h2 class=\"wp-block-heading\">These probabilities are not deterministic<\/h2>\n<p class=\"wp-block-paragraph\">Run the code to display the probabilities for the top 20 tokens again, and you\u2019ll likely get different numbers. Here\u2019s what I got on a second running.<\/p>\n<pre class=\"wp-block-code\"><code>| Token \u00a0 \u00a0 | \u00a0 Log Probability | \u00a0 Percentage (%) |\n|-----------|-------------------|------------------|\n| Heads \u00a0 \u00a0 | \u00a0 \u00a0 -0.0110520627 |\u00a0 \u00a0 98.9008786933 |\n| T \u00a0 \u00a0 \u00a0 \u00a0 | \u00a0 \u00a0 -4.5110521317 | \u00a0 \u00a0 1.0986894433 |\n| Certainly |\u00a0 \u00a0 -14.0110521317 | \u00a0 \u00a0 0.0000822389 |\n| Head\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -14.2610521317 | \u00a0 \u00a0 0.0000640477 |\n| Sure\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -14.2610521317 | \u00a0 \u00a0 0.0000640477 |\n| Tail\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -14.3860521317 | \u00a0 \u00a0 0.0000565219 |\n| heads \u00a0 \u00a0 |\u00a0 \u00a0 -15.3860521317 | \u00a0 \u00a0 0.0000207933 |\n| Heads \u00a0 \u00a0 |\u00a0 \u00a0 -15.5110521317 | \u00a0 \u00a0 0.0000183500 |\n| ``` \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -15.5110521317 | \u00a0 \u00a0 0.0000183500 |\n| _heads\u00a0 \u00a0 |\u00a0 \u00a0 -15.6360521317 | \u00a0 \u00a0 0.0000161938 |\n| tails \u00a0 \u00a0 |\u00a0 \u00a0 -15.6360521317 | \u00a0 \u00a0 0.0000161938 |\n| I'm \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 
-15.8860521317 | \u00a0 \u00a0 0.0000126117 |\n| \"T\u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -15.8860521317 | \u00a0 \u00a0 0.0000126117 |\n| As\u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.3860511780 | \u00a0 \u00a0 0.0000076494 |\n| \" \u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.5110511780 | \u00a0 \u00a0 0.0000067506 |\n| HEAD\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.6360511780 | \u00a0 \u00a0 0.0000059574 |\n| TAIL\u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -16.7610511780 | \u00a0 \u00a0 0.0000052574 |\n| Here's\u00a0 \u00a0 |\u00a0 \u00a0 -16.7610511780 | \u00a0 \u00a0 0.0000052574 |\n| ``\u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -17.1360511780 | \u00a0 \u00a0 0.0000036133 |\n| T \u00a0 \u00a0 \u00a0 \u00a0 |\u00a0 \u00a0 -17.6360511780 | \u00a0 \u00a0 0.0000021916 |\n\nTotal probabilities: 99.999987%<\/code><\/pre>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><\/blockquote>\n<p class=\"wp-block-paragraph\">In their <a href=\"https:\/\/cookbook.openai.com\/examples\/reproducible_outputs_with_the_seed_parameter\">cookbook<\/a>, OpenAI offers the following advice on receiving \u201cmostly identical\u201d outputs:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">If the <code>seed<\/code>, request parameters, and <code>system_fingerprint<\/code> all match across your requests, then model outputs will mostly be identical. 
There is a small chance that responses differ even when request parameters and <code>system_fingerprint<\/code> match, due to the inherent non-determinism of our models.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">They also give \u201cmostly identical\u201d advice in the <a href=\"https:\/\/platform.openai.com\/docs\/advanced-usage\/reproducible-outputs\">reproducible outputs section<\/a> of their documentation.<\/p>\n<p class=\"wp-block-paragraph\">The request parameters that could affect randomness are <code>temperature<\/code> and <code>seed<\/code>. OpenAI also suggests we track <code>system_fingerprint<\/code>, because differences here might cause differences in output. We\u2019ll examine each of these below, but spoiler: none of them will fix or even explain this non-determinism.<\/p>\n<h2 class=\"wp-block-heading\">Temperature, and why it won\u2019t fix this<\/h2>\n<p class=\"wp-block-paragraph\">Temperature controls how random the model\u2019s responses are. Low temperatures (&lt;0.5) make it robotic and predictable, medium temperatures (0.7\u20131.3) allow some creativity, and high temperatures (&gt;1.5) tend to produce gibberish. Temperature is often called the \u201ccreativity parameter\u201d, but this is an oversimplification. In their analysis, Peeperkorn, Kouwenhoven, Brown, and Jordanous (2024) evaluated LLM outputs across four dimensions of creativity: novelty (originality), coherence (logical consistency), cohesion (how well the text flows), and typicality (how well it fits expected patterns). They observed that:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">temperature is weakly correlated with novelty, and unsurprisingly, moderately correlated with incoherence, but there is no relationship with either cohesion or typicality.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">But this is beside the point for coin flipping. 
Under the hood, the log probabilities are divided by the temperature, then exponentiated and renormalized to give the probabilities used for sampling. This creates a non-linear effect: <code>temperature=0.5<\/code> squares the probabilities (before renormalization), making likely tokens dominate, while <code>temperature=2.0<\/code> takes their square root, flattening the distribution.<\/p>\n<p class=\"wp-block-paragraph\">What about <code>temperature=0.0<\/code>? Instead of breaking the math by dividing by zero, the model simply picks the highest-probability token. Sounds deterministic, right? Not quite. Here\u2019s the catch: temperature only comes into play after the log probabilities are computed, when we convert them to probabilities.<\/p>\n<p class=\"wp-block-paragraph\">In summary: <strong>if the logprobs aren\u2019t deterministic, setting temperature to 0.0 won\u2019t make the model deterministic<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">In fact, since we\u2019re just asking the model for the raw logprobs directly rather than generating full responses, the temperature setting doesn\u2019t come into play in our code at all.<\/p>\n<h2 class=\"wp-block-heading\">Seeds, and why they won\u2019t fix this<\/h2>\n<p class=\"wp-block-paragraph\">After temperature is used to compute probabilities, the model samples from these probabilities to pick the next token. OpenAI gives us a little control over the sampling process by letting us set the <code>seed<\/code> parameter for the random number generator. In an ideal world, setting a seed would give us determinism at any temperature. But seeds only affect sampling, not the log probabilities before sampling.<\/p>\n<p class=\"wp-block-paragraph\">In summary: <strong>if the logprobs aren\u2019t deterministic, setting a seed won\u2019t make the model deterministic<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">In fact, <code>seed<\/code> only matters with non-zero temperatures. 
With <code>temperature=0.0<\/code>, the model is always choosing the highest probability token regardless of the seed. Again, since we\u2019re just asking the model for the raw logprobs directly rather than sampling, neither of these settings can help us achieve determinism.<\/p>\n<h2 class=\"wp-block-heading\">System fingerprints, our last hope<\/h2>\n<p class=\"wp-block-paragraph\">The <code>system_fingerprint<\/code> identifies the current combination of model weights, infrastructure, and configuration options in OpenAI\u2019s backend. At least, that\u2019s what OpenAI tells us. Variations in system fingerprints might indeed explain variations in logprobs. Except that they don\u2019t, as we will verify below.<\/p>\n<h2 class=\"wp-block-heading\">Nothing can get you determinism<\/h2>\n<p class=\"wp-block-paragraph\">Let\u2019s confirm what we\u2019ve been building toward. We\u2019ll run the same request 10 times with every safeguard in place. Even though neither temperature nor seed <em>should<\/em> matter for what we\u2019re doing, you can never be too safe, so we\u2019ll set <code>temperature=0.0<\/code> and <code>seed=42<\/code>. And to see if infrastructure differences explain our varying logprobs, we\u2019ll print <code>system_fingerprint<\/code>. Here\u2019s the code:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import os\nimport math\nfrom openai import OpenAI\nfrom tabulate import tabulate\nfrom tqdm import tqdm\n\nclient = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))\n\nprompt = 'Flip a coin. Return Heads or Tails only.'\n\ndata = []\n\nfor _ in tqdm(range(10), desc='Generating responses'):\n    response = client.chat.completions.create(\n        model='gpt-4o-2024-08-06',\n        temperature=0.0,\n        seed=42,\n        max_tokens=1,\n        logprobs=True,\n        top_logprobs=20,\n        messages=[{'role': 'user', 'content': prompt}],\n    )\n\n    fingerprint = response.system_fingerprint\n    logprobs_list = response.choices[0].logprobs.content[0].top_logprobs\n    heads_logprob = next(\n        entry.logprob for entry in logprobs_list if entry.token == 'Heads'\n    )\n    pct = math.exp(heads_logprob) * 100\n    data.append([fingerprint, heads_logprob, f\"{pct:.10f}%\"])\n\nheaders = [\"Fingerprint\", \"Logprob\", \"Probability\"]\nprint(tabulate(data, headers=headers, tablefmt=\"pipe\"))<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Across these 10 requests, here are the logprobs and probabilities for the token Heads:<\/p>\n<pre class=\"wp-block-code\"><code>| Fingerprint \u00a0 |\u00a0 \u00a0 Logprob | Probability\u00a0 \u00a0 |\n|---------------|------------|----------------|\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.160339\u00a0 | 85.1854886858% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0110521 | 98.9008786933% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |\n| fp_f9f4fb6dbf | -0.0380541 | 96.2660836887% |<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Mixture-of-experts makes determinism impossible<\/h2>\n<p class=\"wp-block-paragraph\">OpenAI is 
decidedly not open about the architecture behind GPT-4o. However, it\u2019s widely believed that GPT-4o uses a mixture-of-experts (MoE) architecture with either 8 or 16 experts.<\/p>\n<p class=\"wp-block-paragraph\">According to a paper by Google DeepMind researchers Puigcerver, Riquelme, Mustafa, and Houlsby (hat tip to <a href=\"https:\/\/community.openai.com\/t\/why-the-api-output-is-inconsistent-even-after-the-temperature-is-set-to-0\/329541\/9\">user elmstedt on the OpenAI forum<\/a>), mixture-of-experts architectures may add an unavoidable level of non-determinism:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Under capacity constraints, all Sparse MoE approaches route tokens in groups of a fixed size and enforce (or encourage) balance within the group. When groups contain tokens from different sequences or inputs, these tokens compete for available spots in expert buffers. Therefore, the model is no longer deterministic at the sequence-level, but only at the batch-level.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">In other words, when your prompt (a <em>sequence of tokens<\/em>, in the quote above) reaches OpenAI\u2019s servers, it gets batched with a group of other prompts (OpenAI isn\u2019t open about how many other prompts). The tokens in the batch are then routed to \u201cexperts\u201d within the model. However, since only so many tokens can be routed to the same expert, the experts your prompt\u2019s tokens get routed to will depend on all the other prompts in the batch.<\/p>\n<p class=\"wp-block-paragraph\">This \u201ccompetition\u201d for experts introduces a real-world randomness completely beyond our control.<\/p>\n<h2 class=\"wp-block-heading\">Non-determinism beyond mixture-of-experts<\/h2>\n<p class=\"wp-block-paragraph\">While non-determinism may be inherent to real-world mixture-of-experts models, that does not seem to be the <em>only<\/em> source of non-determinism in OpenAI\u2019s models.<\/p>\n<p class=\"wp-block-paragraph\">Making a few changes to our code above (switching to <code>gpt-3.5-turbo-0125<\/code>, looking for the token <code>He<\/code> since GPT-3.5\u2019s tokenizer splits \u201cHeads\u201d differently, and ignoring <code>system_fingerprint<\/code> because this model doesn\u2019t have it) reveals that GPT-3.5-turbo <em>also<\/em> exhibits non-deterministic logprobs:<\/p>\n<pre class=\"wp-block-code\"><code>| \u00a0 \u00a0 Logprob | Probability\u00a0 \u00a0 |\n|-------------|----------------|\n| -0.00278289 | 99.7220983436% |\n| -0.00415331 | 99.5855302068% |\n| -0.00258838 | 99.7414961980% |\n| -0.00204034 | 99.7961735289% |\n| -0.00240277 | 99.7600117933% |\n| -0.00204034 | 99.7961735289% |\n| -0.00204034 | 99.7961735289% |\n| -0.00258838 | 99.7414961980% |\n| -0.00351419 | 99.6491976144% |\n| -0.00201214 | 99.7989878007% |<\/code><\/pre>\n<p class=\"wp-block-paragraph\">No one is claiming that GPT-3.5-turbo uses a mixture-of-experts architecture. 
Thus, there must be additional factors beyond mixture-of-experts contributing to this non-determinism.<\/p>\n<h2 class=\"wp-block-heading\">What 10,000 GPT-4o coin flip probabilities tell us<\/h2>\n<p class=\"wp-block-paragraph\">To better understand the patterns and magnitude of this non-determinism, I conducted a more extensive experiment with GPT-4o, performing 10,000 \u201ccoin flips\u201d while recording the probability assigned to \u201cHeads\u201d in each case.<\/p>\n<p class=\"wp-block-paragraph\">The results reveal something fascinating. Across 10,000 API calls with identical parameters, GPT-4o produced not just a few different probability values, but <strong>42 distinct probabilities<\/strong>. If the mixture-of-experts hypothesis were the complete explanation for non-determinism in GPT-4o, we might expect to see one distinct probability for each expert. But GPT-4o is believed to have either 8 or 16 experts, <strong>not 42<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">In the output below, I clustered these probabilities, ensuring that each cluster was separated from the others by at least 0.01 percentage points. 
This groups the output into <strong>12 clusters<\/strong>.<\/p>\n<pre class=\"wp-block-code\"><code>Probability\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 Count \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 Fingerprints\n------------------------------------------------------------------\n85.1854379113% \u00a0 \u00a0 \u00a0 5 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n85.1854455275% \u00a0 \u00a0 \u00a0 74\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n85.1854886858% \u00a0 \u00a0 \u00a0 180 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n------------------------------------------------------------------\n88.0662448207% \u00a0 \u00a0 \u00a0 31\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n88.0678628883% \u00a0 \u00a0 \u00a0 2 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_f9f4fb6dbf\n------------------------------------------------------------------\n92.3997629747% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n92.3997733012% \u00a0 \u00a0 \u00a0 4 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n92.3997836277% \u00a0 \u00a0 \u00a0 3 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n92.4128943690% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_f9f4fb6dbf\n92.4129143363% \u00a0 \u00a0 \u00a0 21\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n92.4129246643% \u00a0 \u00a0 \u00a0 8 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n------------------------------------------------------------------\n93.9906837191% \u00a0 \u00a0 \u00a0 4 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n95.2569999350% \u00a0 \u00a0 \u00a0 36\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 
\u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n96.2660836887% \u00a0 \u00a0 \u00a0 3391\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n96.2661285161% \u00a0 \u00a0 \u00a0 2636\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n------------------------------------------------------------------\n97.0674551052% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.0674778863% \u00a0 \u00a0 \u00a0 3 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.0675003058% \u00a0 \u00a0 \u00a0 4 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.0675116963% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.0680739932% \u00a0 \u00a0 \u00a0 19\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n97.0681293191% \u00a0 \u00a0 \u00a0 6 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n97.0681521003% \u00a0 \u00a0 \u00a0 74\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n97.0682421405% \u00a0 \u00a0 \u00a0 4 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n97.7008960695% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_f9f4fb6dbf\n97.7011122645% \u00a0 \u00a0 \u00a0 3 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.7011462953% \u00a0 \u00a0 \u00a0 3 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n97.7018178132% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n98.2006069902% \u00a0 \u00a0 \u00a0 426 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.2006876548% \u00a0 \u00a0 \u00a0 6 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 
fp_f9f4fb6dbf\n98.2007107019% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n98.2009525133% \u00a0 \u00a0 \u00a0 5 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n98.2009751945% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n98.2009867181% \u00a0 \u00a0 \u00a0 1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n98.5930987656% \u00a0 \u00a0 \u00a0 3 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.5931104270% \u00a0 \u00a0 \u00a0 235 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.5931222721% \u00a0 \u00a0 \u00a0 4 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.5931340253% \u00a0 \u00a0 \u00a0 9 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n98.5931571644% \u00a0 \u00a0 \u00a0 159 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.5931805790% \u00a0 \u00a0 \u00a0 384 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8\n------------------------------------------------------------------\n98.9008436920% \u00a0 \u00a0 \u00a0 95\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.9008550214% \u00a0 \u00a0 \u00a0 362 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf\n98.9008786933% \u00a0 \u00a0 \u00a0 1792\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fp_eb9dce56a8, fp_f9f4fb6dbf<\/code><\/pre>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><\/blockquote>\n<p class=\"wp-block-paragraph\">(With a threshold of 0.001 there are 13 clusters, and with a threshold of 0.0001 there are 17 clusters.)<\/p>\n<p class=\"wp-block-paragraph\">As the chart above demonstrates, this multitude of results cannot be explained by <code>system_fingerprint<\/code> values. 
Across all 10,000 calls, I received only two different system fingerprints: 4488 results with <code>fp_f9f4fb6dbf<\/code> and 5512 with <code>fp_eb9dce56a8<\/code>, and for the most part the two system fingerprints returned the same sets of probabilities, rather than each fingerprint producing its own distinct set of probabilities.<\/p>\n<p class=\"wp-block-paragraph\">It <em>could<\/em> be that these 12 clusters of probabilities represent <strong>12 different experts<\/strong>. Even assuming that, the variations within the clusters remain puzzling. These don\u2019t seem likely to be simple rounding errors, because they are too systematic and consistent. Take the giant cluster at around 96.266%, whose two distinct probabilities together account for 6,027 of our 10,000 coin flips. The difference between these two probabilities, 0.0000448274%, is tiny but persistent.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion: Non-determinism is baked in<\/h2>\n<p class=\"wp-block-paragraph\">There is an underlying randomness in the log probabilities returned by all currently available non-thinking OpenAI models: GPT-4o, GPT-4o-mini, and the two flavors of GPT-3.5-turbo. Because this non-determinism is baked into the log probabilities, there\u2019s no way for a user to get around it. 
Temperature and seed values have no effect, and system fingerprints don\u2019t explain it.<\/p>\n<p class=\"wp-block-paragraph\">While mixture-of-experts architectures inherently introduce some randomness in the competition for experts, the non-determinism in GPT-4o seems to go far beyond this, and the non-determinism in GPT-3.5-turbo can\u2019t be explained by this at all, because GPT-3.5-turbo isn\u2019t a mixture-of-experts model.<\/p>\n<p class=\"wp-block-paragraph\">While we can\u2019t verify this claim any more because the model isn\u2019t being served, this behaviour wasn\u2019t seen with GPT-3, according to <a href=\"https:\/\/medium.com\/r\/?url=https%3A%2F%2Fcommunity.openai.com%2Ft%2Fnon-deterministic-probabilities-for-first-generated-token-in-chat-completion%2F726074%2F5\">user _j on the OpenAI forum<\/a>:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">It is a symptom that was not seen on prior GPT-3 AI models where across hundreds of trials to investigate sampling, you never had to doubt that logprobs would be the same. Even if you found a top-2 answer that returned exactly the same logprob value via the API, you would never see them switch position or return different values.<\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">This suggests that whatever is causing this randomness first emerged in either GPT-3.5 or GPT-3.5-turbo.<\/p>\n<p class=\"wp-block-paragraph\">But regardless of when it emerged, this non-determinism is a serious obstacle to understanding these models. If you want to study a model\u2014how it generalizes, how it biases responses, how it assigns probabilities to different tokens\u2014you need consistency. 
But as we\u2019ve seen, even when we lock down every knob OpenAI lets us touch, we still can\u2019t get an answer to the simplest possible question: <strong>\u201cwhat is the probability that GPT-4o says a coin lands heads?\u201d<\/strong><\/p>\n<p class=\"wp-block-paragraph\">Worse, while mixture-of-experts explains some of this non-determinism, there are clearly other, hidden sources of randomness that we can\u2019t see, control, or understand. In an ideal world, the API would provide more transparency by telling us which expert processed our request or by offering additional parameters to control this routing process. Without such visibility, we\u2019re left guessing at the true nature of the variability.<\/p>\n<h2 class=\"wp-block-heading\">References<\/h2>\n<p class=\"wp-block-paragraph\">Bar-Hillel, M., Peer, E., &amp; Acquisti, A. (2014). \u201cHeads or tails?\u201d \u2013 A reachability bias in binary choice. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40(6), 1656\u20131663. <a href=\"https:\/\/doi.org\/10.1037\/xlm0000005\">https:\/\/doi.org\/10.1037\/xlm0000005<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">Peeperkorn, M., Kouwenhoven, T., Brown, D., &amp; Jordanous, A. (2024). Is temperature the creativity parameter of large language models? In The 15th International Conference on Computational Creativity (ICCC\u201924). <a href=\"https:\/\/arxiv.org\/abs\/2405.00492\">arXiv:2405.00492<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">Puigcerver, J., Riquelme, C., Mustafa, B., &amp; Houlsby, N. (2024). From sparse to soft mixtures of experts. In The Twelfth International Conference on Learning Representations (ICLR 2024). <a href=\"https:\/\/openreview.net\/forum?id=jxpsAj7ltE\">https:\/\/openreview.net\/forum?id=jxpsAj7ltE<\/a>. 
<a href=\"https:\/\/arxiv.org\/abs\/2308.00951\">arXiv:2308.00951<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">Van Koevering, K., &amp; Kleinberg, J. (2024). How random is random? Evaluating the randomness and humanness of LLMs\u2019 coin flips. <a href=\"https:\/\/arxiv.org\/abs\/2406.00092\">arXiv:2406.00092<\/a>.<\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/avoidable-and-unavoidable-randomness-in-gpt-4o\/\">Avoidable and Unavoidable Randomness in GPT-4o<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p>Vincent Vatter<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/avoidable-and-unavoidable-randomness-in-gpt-4o\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Avoidable and Unavoidable Randomness in GPT-4o Of course there is randomness in GPT-4o\u2019s outputs. After all, the model samples from a probability distribution when choosing each token. But what I didn\u2019t understand was that those very probabilities themselves are not deterministic. 
Even with consistent prompts, fixed seeds, and temperature set to zero, GPT-4o still introduces [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,1904,240,1905,71,70,1906],"tags":[1163,1907,1244],"class_list":["post-2184","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-deterministic-models","category-editors-pick","category-gpt-4o","category-large-language-models","category-machine-learning","category-randomness","tag-gpt","tag-o","tag-openai"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2184"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=2184"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2184\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=2184"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=2184"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=2184"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}