{"id":3592,"date":"2025-05-06T07:04:46","date_gmt":"2025-05-06T07:04:46","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/05\/06\/making-sense-of-kpi-changes\/"},"modified":"2025-05-06T07:04:46","modified_gmt":"2025-05-06T07:04:46","slug":"making-sense-of-kpi-changes","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/05\/06\/making-sense-of-kpi-changes\/","title":{"rendered":"Making Sense of KPI\u00a0Changes"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\">As analysts, we usually monitor metrics. Quite often, metrics change. And when they do, it\u2019s our job to figure out what\u2019s going on: why did the conversion rate suddenly drop, or what is driving consistent revenue growth?<\/p>\n<p class=\"wp-block-paragraph\">I started my journey in data analytics as a <a href=\"https:\/\/towardsdatascience.com\/tag\/kpi\/\" title=\"Kpi\">KPI<\/a> analyst. For almost three years, I did root cause analysis and KPI deep dives nearly full-time. Even after moving to product analytics, I still regularly investigate KPI shifts. You could say I\u2019ve become quite the experienced analytics detective.<\/p>\n<p class=\"wp-block-paragraph\">The cornerstone of <a href=\"https:\/\/towardsdatascience.com\/tag\/root-cause-analysis\/\" title=\"Root Cause Analysis\">Root Cause Analysis<\/a> is usually slicing and dicing the data. Most often, figuring out which segments are driving the change will give you a clue to the root causes. So, in this article, I would like to share a framework for estimating how different segments contribute to changes in your key metric. 
We will put together a set of functions to slice and dice our data and identify the main drivers behind the metric\u2019s changes.<\/p>\n<p class=\"wp-block-paragraph\">However, in real life, before jumping into data crunching, it\u2019s important to understand the context:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Is the data complete, and can we compare recent periods to previous ones?<\/li>\n<li class=\"wp-block-list-item\">Are there any long-term trends and known seasonal effects we\u2019ve seen in the past?<\/li>\n<li class=\"wp-block-list-item\">Have we launched anything recently, or are we aware of any external events affecting our metrics, such as a competitor\u2019s marketing campaign or currency fluctuations?<\/li>\n<\/ul>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>I\u2019ve discussed such nuances in more detail in my previous article, <a href=\"https:\/\/towardsdatascience.com\/anomaly-root-cause-analysis-101-98f63dd12016\/\" target=\"_blank\" rel=\"noreferrer noopener\">\u201cRoot Cause Analysis 101\u201d<\/a>.<\/em><\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\">KPI change framework<\/h2>\n<p class=\"wp-block-paragraph\">We encounter different metrics, and analysing their changes requires different approaches. Let\u2019s start by defining the two types of metrics we will be working with:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>Simple metrics<\/strong> represent a single measure, for example, total revenue or the number of active users. Despite their simplicity, they are often used in product analytics. One common example is the North Star metric. A good North Star metric estimates the total value received by customers. For example, Airbnb might use nights booked, and WhatsApp might track messages sent. 
Both are simple metrics.<\/li>\n<\/ul>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>You can learn more about North Star Metrics from <a href=\"https:\/\/amplitude.com\/books\/north-star\" target=\"_blank\" rel=\"noreferrer noopener\">the Amplitude Playbook.<\/a><\/em><\/p>\n<\/blockquote>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">However, we can\u2019t avoid using <strong>compound or ratio metrics<\/strong>, like conversion rate or average revenue per user (ARPU). Such metrics help us track our product performance more precisely and isolate the impact of specific changes. For example, imagine your team is working on improving the registration page. They can potentially track the number of registered customers as their primary KPI, but it might be highly affected by external factors (e.g., a marketing campaign driving more traffic). A better metric for this case would be the conversion rate from landing on the registration page to completing it.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">We will use a fictional example to learn how to approach root cause analysis for different types of metrics. Imagine we are working on an e-commerce product, and our team is focused on two main KPIs:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>total revenue<\/strong> (<em>a simple metric<\/em>),<\/li>\n<li class=\"wp-block-list-item\">\n<strong>conversion to purchase<\/strong>\u200a\u2014\u200athe ratio of users who made a purchase to the total number of users (<em>a ratio metric<\/em>).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">We will use synthetic datasets to look at possible scenarios of metrics\u2019 changes. 
Now it\u2019s time to move on and see what\u2019s going on with the revenue.<\/p>\n<h2 class=\"wp-block-heading\">Analysis: simple metrics<\/h2>\n<p class=\"wp-block-paragraph\">Let\u2019s start simple and dig into the revenue changes. As usual, the first step is to load a dataset. Our data has two dimensions: country and maturity (whether a customer is new or existing). Additionally, we have three different scenarios to test our framework under various conditions.<\/p>\n<pre class=\"wp-block-prismatic-blocks\" datatext=\"el1746462665522\"><code class=\"language-julia\">import pandas as pd\ndf = pd.read_csv('absolute_metrics_example.csv', sep = '\\t')\ndf.head()<\/code><\/pre>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1AvUSulLRtjigQex4MjoMLQ.png?ssl=1\" alt=\"\" class=\"wp-image-603332\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The main goal of our analysis is to determine how each segment contributes to the change in our top-line metric. Let\u2019s break it down. We will write a bunch of formulas. But don\u2019t worry, it won\u2019t require any knowledge beyond basic arithmetic.<\/p>\n<p class=\"wp-block-paragraph\">First of all, it\u2019s helpful to see how the metric changed in each segment, both in absolute and relative numbers.\u00a0<\/p>\n<p class=\"wp-block-shortcode\">\\[\\textbf{difference}^{\\textsf{i}} = \\textbf{metric}_{\\textsf{after}}^{\\textsf{i}} - \\textbf{metric}_{\\textsf{before}}^{\\textsf{i}}\\\\<br \/>\n\\textbf{difference_rate}^{\\textsf{i}} = \\frac{\\textbf{difference}^{\\textsf{i}}}{\\textbf{metric}_{\\textsf{before}}^{\\textsf{i}}}\\]<\/p>\n<p class=\"wp-block-paragraph\">The next step is to look at it holistically and see how each segment contributed to the overall change in the metric. 
We will calculate the impact as the share of the total difference.<\/p>\n<p class=\"wp-block-shortcode\">\\[\\textbf{impact}^{\\textsf{i}} = \\frac{\\textbf{difference}^{\\textsf{i}}}{\\sum_{\\textsf{i}}{\\textbf{difference}^{\\textsf{i}}}}\\]<\/p>\n<p class=\"wp-block-paragraph\">That already gives us some valuable insights. However, to understand whether any segment is behaving unusually and requires special attention, it\u2019s useful to compare the segment\u2019s contribution to the metric change with its initial share of the metric.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Here\u2019s the reasoning. If the segment makes up 90% of our metric, then it\u2019s expected for it to contribute 85\u201395% of the change. But if a segment that accounts for only 10% ends up contributing 90% of the change, that\u2019s definitely an anomaly.<\/p>\n<p class=\"wp-block-paragraph\">To calculate it, we will simply normalise each segment\u2019s contribution to the metric by the initial segment size.\u00a0<\/p>\n<p class=\"wp-block-shortcode\">\\[\\textbf{segment_share}_{\\textsf{before}}^{\\textsf{i}} = \\frac{\\textbf{metric}_{\\textsf{before}}^{\\textsf{i}}}{\\sum_{\\textsf{i}}{\\textbf{metric}_{\\textsf{before}}^{\\textsf{i}}}}\\\\<br \/>\n\\textbf{impact_normalised}^{\\textsf{i}} = \\frac{\\textbf{impact}^{\\textsf{i}}}{\\textbf{segment_share}_{\\textsf{before}}^{\\textsf{i}}}\\]<\/p>\n<p class=\"wp-block-paragraph\">That\u2019s it for the formulas. Now, let\u2019s write the code and see this approach in practice. 
It will be easier to understand how it works through practical examples.\u00a0<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">def calculate_simple_growth_metrics(stats_df):\n  # Calculating overall stats\n  before = stats_df.before.sum()\n  after = stats_df.after.sum()\n  print('Metric change: %.2f -&gt; %.2f (%.2f%%)' % (before, after, 100*(after - before)\/before))\n\n  # Estimating impact of each segment\n  stats_df['difference'] = stats_df.after - stats_df.before\n  stats_df['difference_rate'] = (100*stats_df.difference\/stats_df.before)\\\n    .map(lambda x: round(x, 2))\n  stats_df['impact'] = (100*stats_df.difference \/ stats_df.difference.sum())\\\n    .map(lambda x: round(x, 2))\n  stats_df['segment_share_before'] = (100* stats_df.before \/ stats_df.before.sum())\\\n    .map(lambda x: round(x, 2))\n  stats_df['impact_norm'] = (stats_df.impact\/stats_df.segment_share_before)\\\n    .map(lambda x: round(x, 2))\n\n  # Creating visualisations\n  create_parallel_coordinates_chart(stats_df.reset_index(), stats_df.index.name)\n  create_share_vs_impact_chart(stats_df.reset_index(), stats_df.index.name, 'segment_share_before', 'impact')\n  \n  return stats_df.sort_values('impact_norm', ascending = False)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">I believe that visualisations are a crucial part of any data storytelling, as they help viewers grasp insights more quickly and intuitively. That\u2019s why I\u2019ve included a couple of charts in our function:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>A parallel coordinates chart<\/strong> to show how the metric changed in each slice\u200a\u2014\u200athis visualisation will help us see the most significant drivers in absolute terms.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>A scatter plot<\/strong> to compare each segment\u2019s impact on the KPI with the segment\u2019s initial size. 
This chart helps spot anomalies\u200a\u2014\u200asegments whose impact on the KPI is disproportionately large or small.<\/li>\n<\/ul>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>You can find the complete code for the visualisations on <a href=\"https:\/\/github.com\/miptgirl\/miptgirl_medium\/blob\/main\/growth_narrative_llm_agent\/growth_narrative_utils.py\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<\/a>.<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Now that we have all the tools in place to analyse revenue data, let\u2019s see how our framework performs in different scenarios.<\/p>\n<h4 class=\"wp-block-heading\">Scenario 1: Revenue dropped equally across all segments<\/h4>\n<p class=\"wp-block-paragraph\">Let\u2019s start with the first scenario. The analysis is very straightforward\u200a\u2014\u200awe just need to call the function defined above.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">calculate_simple_growth_metrics(\n  df.groupby('country')[['revenue_before', 'revenue_after_scenario_1']].sum()\n    .sort_values('revenue_before', ascending = False).rename(\n        columns = {'revenue_after_scenario_1': 'after', \n          'revenue_before': 'before'}\n    )\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">In the output, we will get a table with detailed stats.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1kPwNy1zZGmyqlUiKJAGKdg.png?ssl=1\" alt=\"\" class=\"wp-image-603331\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">However, in my opinion, visualisations are more informative. 
It\u2019s obvious that revenue dropped by 30\u201340% in all countries, and there are no anomalies.\u00a0<\/p>\n<figure class=\"wp-block-image alignwide\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1_eRkVZXjKCA8oJR7dSbkCw.png?ssl=1\" alt=\"\" class=\"wp-image-603328\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<h4 class=\"wp-block-heading\">Scenario 2: One or more segments drove the change<\/h4>\n<p class=\"wp-block-paragraph\">Let\u2019s check out another scenario by calling the same function.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">calculate_simple_growth_metrics(\n  df.groupby('country')[['revenue_before', 'revenue_after_scenario_2']].sum()\n    .sort_values('revenue_before', ascending = False).rename(\n        columns = {'revenue_after_scenario_2': 'after', \n          'revenue_before': 'before'}\n    )\n)<\/code><\/pre>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1KMb53wuaGRPO4mzVSjv4Gg.png?ssl=1\" alt=\"\" class=\"wp-image-603337\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We can see the biggest drop in both absolute and relative numbers in France. It\u2019s definitely an anomaly since it accounts for 99.9% of the total metric change. 
We can easily spot this in our visualisations.<\/p>\n<figure class=\"wp-block-image alignwide size-large\"><img data-recalc-dims=\"1\" height=\"370\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/image-27-1024x370.png?resize=1024%2C370&#038;ssl=1\" alt=\"\" class=\"wp-image-603322\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Also, it\u2019s worth going back to the first example. We looked at the metric split by country and found no specific segments driving changes. But digging a little bit deeper might help us understand what\u2019s going on. Let\u2019s try adding another layer and look at country and maturity.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">df['segment'] = df.country + ' - ' + df.maturity \ncalculate_simple_growth_metrics(\n    df.groupby(['segment'])[['revenue_before', 'revenue_after_scenario_1']].sum()\n        .sort_values('revenue_before', ascending = False).rename(\n            columns = {'revenue_after_scenario_1': 'after', 'revenue_before': 'before'}\n        )\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Now, we can see that the change is mostly driven by new users across the countries. 
These charts clearly highlight issues with the new customer experience and give you a clear direction for further investigation.<\/p>\n<figure class=\"wp-block-image alignwide\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1kA93mZEPJXewJn__iW94kQ.png?ssl=1\" alt=\"\" class=\"wp-image-603339\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<h4 class=\"wp-block-heading\">Scenario 3: Volume shifting between segments<\/h4>\n<p class=\"wp-block-paragraph\">Finally, let\u2019s explore the last scenario for revenue.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">calculate_simple_growth_metrics(\n    df.groupby(['segment'])[['revenue_before', 'revenue_after_scenario_3']].sum()\n        .sort_values('revenue_before', ascending = False).rename(\n            columns = {'revenue_after_scenario_3': 'after', 'revenue_before': 'before'}\n        )\n)<\/code><\/pre>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1xyqT2inqwNVm2TiiFebn2Q.png?ssl=1\" alt=\"\" class=\"wp-image-603335\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We can clearly see that France is the biggest anomaly\u200a\u2014\u200arevenue in France has dropped, and this change is correlated with the top-line revenue drop. However, there is another outstanding segment\u200a\u2014\u200aSpain. In Spain, revenue has increased significantly.<\/p>\n<p class=\"wp-block-paragraph\">This pattern raises a suspicion that some of the revenue from France might have shifted to Spain. However, we still see a decline in the top-line metric, so it\u2019s worth further investigation. 
Practically, this situation could be caused by data issues, logging errors or service unavailability in some regions (so customers have to use VPNs and appear with a different country in our logs).<\/p>\n<figure class=\"wp-block-image alignwide\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1cMlgFcD96Wn2FGe5-2xpKg.png?ssl=1\" alt=\"\" class=\"wp-image-603333\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We\u2019ve looked at a bunch of different examples, and our framework helped us find the main drivers of change. I hope it\u2019s now clear how to conduct root cause analysis with simple metrics, and we are ready to move on to ratio metrics.<\/p>\n<h2 class=\"wp-block-heading\">Analysis: ratio metrics<\/h2>\n<p class=\"wp-block-paragraph\">Product metrics are often ratios like average revenue per customer or conversion. Let\u2019s see how we can break down changes in this type of metrics. In our case, we will look at conversion.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">There are two types of effects to consider when analysing ratio metrics:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>Change within a segment<\/strong>, for example, if customer conversion in France drops, the overall conversion will also drop.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Change in the mix<\/strong>, for example, if the share of new customers increases, and new users typically convert at a lower rate, this shift in the mix can also lead to a drop in the overall conversion rate.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">To understand what\u2019s going on, we need to be able to distinguish these effects. 
Once again, we will write a bunch of formulas to break down and quantify each type of impact.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s start by defining some useful variables.<\/p>\n<p class=\"wp-block-shortcode\">\\[<br \/>\n\\textbf{c}_{\\textsf{before}}^{\\textsf{i}}, \\textbf{c}_{\\textsf{after}}^{\\textsf{i}} - \\textsf{converted users}\\\\<br \/>\n\\textbf{C}_{\\textsf{before}}^{\\textsf{total}} = \\sum_{\\textsf{i}}{\\textbf{c}_{\\textsf{before}}^{\\textsf{i}}}\\\\<br \/>\n\\textbf{C}_{\\textsf{after}}^{\\textsf{total}} = \\sum_{\\textsf{i}}{\\textbf{c}_{\\textsf{after}}^{\\textsf{i}}}\\\\<br \/>\n\\textbf{t}_{\\textsf{before}}^{\\textsf{i}}, \\textbf{t}_{\\textsf{after}}^{\\textsf{i}} - \\textsf{total users}\\\\<br \/>\n\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} = \\sum_{\\textsf{i}}{\\textbf{t}_{\\textsf{before}}^{\\textsf{i}}}\\\\<br \/>\n\\textbf{T}_{\\textsf{after}}^{\\textsf{total}} = \\sum_{\\textsf{i}}{\\textbf{t}_{\\textsf{after}}^{\\textsf{i}}}<br \/>\n\\]<\/p>\n<p class=\"wp-block-paragraph\">Next, let\u2019s talk about the impact of <strong>the change in mix<\/strong>. To isolate this effect, we will estimate how the overall conversion rate would change if conversion rates within all segments remained constant, and the absolute numbers for both converted and total users in all other segments stayed fixed. The only variables we will change are the total and converted number of users in segment <em>i<\/em>. 
We will adjust it to reflect its new share in the overall population.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s start by calculating how the total number of users in our segment needs to change to match the target segment share.\u00a0<\/p>\n<p class=\"wp-block-shortcode\">\\[<br \/>\n\\frac{\\textbf{t}_{\\textsf{after}}^{\\textsf{i}}}{\\textbf{T}_{\\textsf{after}}^{\\textsf{total}}} = \\frac{\\textbf{t}_{\\textsf{before}}^{\\textsf{i}} + \\delta\\textbf{t}^{\\textsf{i}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} + \\delta\\textbf{t}^{\\textsf{i}}} \\\\<br \/>\n\\delta\\textbf{t}^{\\textsf{i}} = \\frac{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} * \\textbf{t}_{\\textsf{after}}^{\\textsf{i}} - \\textbf{T}_{\\textsf{after}}^{\\textsf{total}} * \\textbf{t}_{\\textsf{before}}^{\\textsf{i}}}{\\textbf{T}_{\\textsf{after}}^{\\textsf{total}} - \\textbf{t}_{\\textsf{after}}^{\\textsf{i}}}<br \/>\n\\]<\/p>\n<p class=\"wp-block-paragraph\">Now, we can estimate the change in mix impact using the following formula.\u00a0<\/p>\n<p class=\"wp-block-shortcode\">\\[<br \/>\n\\textbf{change in mix impact} = \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}} + \\delta\\textbf{t}^{\\textsf{i}} * \\frac{\\textbf{c}_{\\textsf{before}}^{\\textsf{i}}}{\\textbf{t}_{\\textsf{before}}^{\\textsf{i}}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} + \\delta\\textbf{t}^{\\textsf{i}}} - \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}}}<br \/>\n\\]<\/p>\n<p class=\"wp-block-paragraph\">The next step is to estimate <strong>the impact of the conversion rate change within segment <em>i<\/em><\/strong>. To isolate this effect, we will keep the total number of customers and converted customers in all other segments fixed. 
We will only change the number of converted users in segment <em>i<\/em> to match the target conversion rate at a new point.\u00a0<\/p>\n<p class=\"wp-block-shortcode\">\\[<br \/>\n\\textbf{change within segment impact} = \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}} + \\textbf{t}_{\\textsf{before}}^{\\textsf{i}} * \\frac{\\textbf{c}_{\\textsf{after}}^{\\textsf{i}}}{\\textbf{t}_{\\textsf{after}}^{\\textsf{i}}} - \\textbf{c}_{\\textsf{before}}^{\\textsf{i}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}}} - \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}}} \\\\ = \\frac{\\textbf{t}_{\\textsf{before}}^{\\textsf{i}} * \\textbf{c}_{\\textsf{after}}^{\\textsf{i}} - \\textbf{t}_{\\textsf{after}}^{\\textsf{i}} * \\textbf{c}_{\\textsf{before}}^{\\textsf{i}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} * \\textbf{t}_{\\textsf{after}}^{\\textsf{i}}}<br \/>\n\\]<\/p>\n<p class=\"wp-block-paragraph\">We can\u2019t simply sum the different types of effects because their relationship is not linear. That\u2019s why we also need to estimate the combined impact for the segment. This will combine the two formulas above, assuming that we will match both the new conversion rate within segment <em>i<\/em> and the new segment share.<\/p>\n<p class=\"wp-block-shortcode\">\\[<br \/>\n\\textbf{total segment change} = \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}} - \\textbf{c}_{\\textsf{before}}^{\\textsf{i}} + (\\textbf{t}_{\\textsf{before}}^{\\textsf{i}} + \\delta\\textbf{t}^{\\textsf{i}}) * \\frac{\\textbf{c}_{\\textsf{after}}^{\\textsf{i}}}{\\textbf{t}_{\\textsf{after}}^{\\textsf{i}}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}} + \\delta\\textbf{t}^{\\textsf{i}}} - \\frac{\\textbf{C}_{\\textsf{before}}^{\\textsf{total}}}{\\textbf{T}_{\\textsf{before}}^{\\textsf{total}}}<br \/>\n\\]<\/p>\n<p class=\"wp-block-paragraph\">It\u2019s worth noting that these effect estimations are not 100% accurate (i.e. we can\u2019t sum them up directly). 
However, they are precise enough to make decisions and identify the main drivers of the change.<\/p>\n<p class=\"wp-block-paragraph\">The next step is to put everything into code. We will again leverage visualisations: correlation and parallel coordinates charts that we\u2019ve already used for simple metrics, along with a couple of waterfall charts to break down impact by segments.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">def calculate_conversion_effects(df, dimension, numerator_field1, denominator_field1, \n                       numerator_field2, denominator_field2):\n  cmp_df = df.groupby(dimension)[[numerator_field1, denominator_field1, numerator_field2, denominator_field2]].sum()\n  cmp_df = cmp_df.rename(columns = {\n      numerator_field1: 'c1', \n      numerator_field2: 'c2',\n      denominator_field1: 't1', \n      denominator_field2: 't2'\n  })\n    \n  cmp_df['conversion_before'] = cmp_df['c1']\/cmp_df['t1']\n  cmp_df['conversion_after'] = cmp_df['c2']\/cmp_df['t2']\n  \n  C1 = cmp_df['c1'].sum()\n  T1 = cmp_df['t1'].sum()\n  C2 = cmp_df['c2'].sum()\n  T2 = cmp_df['t2'].sum()\n\n  print('conversion before = %.2f' % (100*C1\/T1))\n  print('conversion after = %.2f' % (100*C2\/T2))\n  print('total conversion change = %.2f' % (100*(C2\/T2 - C1\/T1)))\n  \n  cmp_df['dt'] = (T1*cmp_df.t2 - T2*cmp_df.t1)\/(T2 - cmp_df.t2)\n  cmp_df['total_effect'] = (C1 - cmp_df.c1 + (cmp_df.t1 + cmp_df.dt)*cmp_df.conversion_after)\/(T1 + cmp_df.dt) - C1\/T1\n  cmp_df['mix_change_effect'] = (C1 + cmp_df.dt*cmp_df.conversion_before)\/(T1 + cmp_df.dt) - C1\/T1\n  cmp_df['conversion_change_effect'] = (cmp_df.t1*cmp_df.c2 - cmp_df.t2*cmp_df.c1)\/(T1 * cmp_df.t2)\n  \n  for col in ['total_effect', 'mix_change_effect', 'conversion_change_effect', 'conversion_before', 'conversion_after']:\n      cmp_df[col] = 100*cmp_df[col]\n        \n  cmp_df['conversion_diff'] = cmp_df.conversion_after - cmp_df.conversion_before\n  
cmp_df['before_segment_share'] = 100*cmp_df.t1\/T1\n  cmp_df['after_segment_share'] = 100*cmp_df.t2\/T2\n  for p in ['before_segment_share', 'after_segment_share', 'conversion_before', 'conversion_after', 'conversion_diff',\n                   'total_effect', 'mix_change_effect', 'conversion_change_effect']:\n      cmp_df[p] = cmp_df[p].map(lambda x: round(x, 2))\n  cmp_df['total_effect_share'] = 100*cmp_df.total_effect\/(100*(C2\/T2 - C1\/T1))\n  cmp_df['impact_norm'] = cmp_df.total_effect_share\/cmp_df.before_segment_share\n\n  # creating visualisations\n  create_share_vs_impact_chart(cmp_df.reset_index(), dimension, 'before_segment_share', 'total_effect_share')\n  cmp_df = cmp_df[['t1', 't2', 'before_segment_share', 'after_segment_share', 'conversion_before', 'conversion_after', 'conversion_diff',\n                   'total_effect', 'mix_change_effect', 'conversion_change_effect', 'total_effect_share']]\n\n  plot_conversion_waterfall(\n      100*C1\/T1, 100*C2\/T2, cmp_df[['total_effect']].rename(columns = {'total_effect': 'effect'})\n  )\n\n  # putting together effects split by change of mix and conversion change\n  tmp = []\n  for rec in cmp_df.reset_index().to_dict('records'): \n    tmp.append(\n      {\n          'segment': rec[dimension] + ' - change of mix',\n          'effect': rec['mix_change_effect']\n      }\n    )\n    tmp.append(\n      {\n        'segment': rec[dimension] + ' - conversion change',\n        'effect': rec['conversion_change_effect']\n      }\n    )\n  effects_det_df = pd.DataFrame(tmp)\n  effects_det_df['effect_abs'] = effects_det_df.effect.map(lambda x: abs(x))\n  effects_det_df = effects_det_df.sort_values('effect_abs', ascending = False) \n  top_effects_det_df = effects_det_df.head(5).drop('effect_abs', axis = 1)\n  plot_conversion_waterfall(\n    100*C1\/T1, 100*C2\/T2, top_effects_det_df.set_index('segment'),\n    add_other = True\n  )\n\n  create_parallel_coordinates_chart(cmp_df.reset_index(), dimension, 
before_field='before_segment_share', \n    after_field='after_segment_share', impact_norm_field = 'impact_norm', \n    metric_name = 'share of segment', show_mean = False)\n  create_parallel_coordinates_chart(cmp_df.reset_index(), dimension, before_field='conversion_before', \n    after_field='conversion_after', impact_norm_field = 'impact_norm', \n    metric_name = 'conversion', show_mean = False)\n\n  return cmp_df.rename(columns = {'t1': 'total_before', 't2': 'total_after'})<\/code><\/pre>\n<p class=\"wp-block-paragraph\">With that, we\u2019re done with the theory and ready to apply this framework in practice. We\u2019ll load another dataset that includes a couple of scenarios.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">conv_df = pd.read_csv('conversion_metrics_example.csv', sep = '\\t')\nconv_df.head()<\/code><\/pre>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"166\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/image-28-1024x166.png?resize=1024%2C166&#038;ssl=1\" alt=\"\" class=\"wp-image-603323\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<h4 class=\"wp-block-heading\">Scenario 1: Uniform conversion uplift<\/h4>\n<p class=\"wp-block-paragraph\">We will again just call the function above and analyse the results.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">calculate_conversion_effects(\n    conv_df, 'country', 'converted_users_before', 'users_before', \n    'converted_users_after_scenario_1', 'users_after_scenario_1',\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">The first scenario is pretty straightforward: conversion has increased in all countries by 4\u20137% points, resulting in the top-line conversion increase as well.\u00a0<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"226\" width=\"1024\" 
decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/image-29-1024x226.png?resize=1024%2C226&#038;ssl=1\" alt=\"\" class=\"wp-image-603324\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We can see that there are no anomalies in segments: the impact is correlated with the segment share, and conversion has increased uniformly across all countries.\u00a0<\/p>\n<figure class=\"wp-block-image alignwide size-large\"><img data-recalc-dims=\"1\" height=\"376\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/image-30-1024x376.png?resize=1024%2C376&#038;ssl=1\" alt=\"\" class=\"wp-image-603325\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1OgJLCcUA3s9Xe9rB7XhxBg.png?ssl=1\" alt=\"\" class=\"wp-image-603329\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We can look at the waterfall charts to see the change split by countries and types of effects. Even though effect estimations are not additive, we can still use them to compare the impacts of different slices.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1vY6ZXW6liTEKkCHzOAGa1Q.png?ssl=1\" alt=\"\" class=\"wp-image-603336\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The suggested framework has been quite helpful. 
We were able to quickly figure out what\u2019s going on with the metrics.<\/p>\n<h4 class=\"wp-block-heading\">Scenario 2: Simpson\u2019s paradox<\/h4>\n<p class=\"wp-block-paragraph\">Let\u2019s take a look at a slightly trickier case.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-julia\">calculate_conversion_effects(\n    conv_df, 'country', 'converted_users_before', 'users_before', \n    'converted_users_after_scenario_2', 'users_after_scenario_2',\n)<\/code><\/pre>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/15r7zncZ7ZSY5QSVkVz2qdQ.png?ssl=1\" alt=\"\" class=\"wp-image-603330\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The story is more complicated here:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">The share of UK users has increased while conversion in this segment has dropped sharply, from 74.9% to 34.8%.<\/li>\n<li class=\"wp-block-list-item\">In all other countries, conversion has increased by 8\u201311 percentage points.<\/li>\n<\/ul>\n<figure class=\"wp-block-image alignwide\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1XPRksbj0j1DsF1xH6Xhwtw.png?ssl=1\" alt=\"\" class=\"wp-image-603340\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Unsurprisingly, the conversion change in the UK is the biggest driver of the top-line metric decline.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/1t4G0P2GnRxpxcXBpcYxgSQ.png?ssl=1\" alt=\"\" class=\"wp-image-603338\"><figcaption class=\"wp-element-caption\">Image by 
author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Here we can see an example of non-linearity: 10% of effects are not explained by the current split. Let\u2019s dig one level deeper and add a maturity dimension. This reveals the true story:\u00a0<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Conversion has actually increased uniformly by around 10 percentage points in all segments, yet the top-line metric has still dropped.\u00a0<\/li>\n<li class=\"wp-block-list-item\">The main reason is the increase in the share of new users in the UK, as these customers have a significantly lower conversion rate than average.<\/li>\n<\/ul>\n<figure class=\"wp-block-image alignwide\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/15TksaS-SevsbXPjGGZz72w.png?ssl=1\" alt=\"\" class=\"wp-image-603334\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Here is the split of effects by segments.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" height=\"662\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/image-31-1024x662.png?resize=1024%2C662&#038;ssl=1\" alt=\"\" class=\"wp-image-603326\"><figcaption class=\"wp-element-caption\">Image by author<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">This counterintuitive effect is called <a href=\"https:\/\/en.wikipedia.org\/wiki\/Simpson%27s_paradox\" rel=\"noreferrer noopener\" target=\"_blank\">Simpson\u2019s paradox<\/a>. A classic example of Simpson\u2019s paradox comes from a 1973 study on graduate school admissions at Berkeley. At first, it seemed like men had a higher chance of getting in than women. 
However, when the researchers looked at the departments people were applying to, it turned out women were applying to more competitive departments with lower admission rates, while men tended to apply to less competitive ones. Once department was accounted for as a confounder, the data actually showed a small but significant bias in favour of women.<\/p>\n<p class=\"wp-block-paragraph\">As always, visualisation can give you a bit of intuition on how this paradox works.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/05\/00RUik4lFTSxfCeOT.gif?ssl=1\" alt=\"\" class=\"wp-image-603341\"><figcaption class=\"wp-element-caption\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Simpson%27s_paradox#\/media\/File:Simpsons_paradox_-_animation.gif\" target=\"_blank\" rel=\"noreferrer noopener\">source<\/a> | licence <a href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\" target=\"_blank\" rel=\"noreferrer noopener\">CC BY-SA 4.0<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">That\u2019s it. 
We\u2019ve learned how to break down the changes in ratio metrics.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>You can find the complete code and data on <a href=\"https:\/\/github.com\/miptgirl\/miptgirl_medium\/tree\/main\/growth_narrative_llm_agent\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<\/a>.<\/em><\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n<p class=\"wp-block-paragraph\">It\u2019s been a long journey, so let\u2019s quickly recap what we\u2019ve covered in this article:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">We\u2019ve identified two major types of metrics: simple metrics (like revenue or number of users) and ratio metrics (like conversion rate or ARPU).\u00a0<\/li>\n<li class=\"wp-block-list-item\">For each metric type, we\u2019ve learned how to break down the changes and identify the main drivers. We\u2019ve put together a set of functions that can help you find the answers with just a couple of function calls.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">With this practical framework, you\u2019re now equipped to conduct root cause analysis for both simple and ratio metrics. However, there is still room for improvement in our solution. In my next article, I will explore how to build an LLM agent that will do the whole analysis and summary for us. Stay tuned!<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>Thank you very much for reading. I hope this article was insightful for you. 
<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/making-sense-of-kpi-changes\/\">Making Sense of KPI\u00a0Changes<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Mariya Mansurova<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/making-sense-of-kpi-changes\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Making Sense of KPI\u00a0Changes As analysts, we are usually monitoring metrics. Quite often, metrics change. And when they do, it\u2019s our job to figure out what\u2019s going on: why did the conversion rate suddenly drop, or what is driving consistent revenue growth? I started my journey in data analytics as a Kpi analyst. For almost [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,2334,83,240,2570,157,2571],"tags":[2573,2572,1195],"class_list":["post-3592","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-data-mining","category-data-science","category-editors-pick","category-kpi","category-python","category-root-cause-analysis","tag-changes","tag-kpi","tag-metrics"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3592"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=3592"}],"version-history":[{"count":0,"href":"https:
\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3592\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=3592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=3592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=3592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}