{"id":1873,"date":"2025-02-15T07:02:21","date_gmt":"2025-02-15T07:02:21","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/02\/15\/start-asking-your-data-why-a-gentle-intro-to-causality\/"},"modified":"2025-02-15T07:02:21","modified_gmt":"2025-02-15T07:02:21","slug":"start-asking-your-data-why-a-gentle-intro-to-causality","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/02\/15\/start-asking-your-data-why-a-gentle-intro-to-causality\/","title":{"rendered":"\u27a1\ufe0f Start Asking Your Data \u2018Why?\u2019 \u2014 A Gentle Intro To Causality"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\">Correlation does not imply causation. It turns out, however, that with a few simple but ingenious tricks one can potentially unveil causal relationships within standard observational data, without having to resort to expensive randomised controlled trials.<\/p>\n<p class=\"wp-block-paragraph\">This post is aimed at anyone making data-driven decisions. The main takeaway is that causal inference may be possible once you understand that the <strong>story behind the data<\/strong> is as important as the data itself.<\/p>\n<p class=\"wp-block-paragraph\">By introducing <em>Simpson\u2019s<\/em> and <em>Berkson\u2019s<\/em> <em>Paradoxes<\/em>, situations where the outcome of a population is in conflict with that of its cohorts, I shine a light on the importance of using causal reasoning to identify these paradoxes in data and avoid misinterpretation. 
Specifically, I introduce <em>causal graphs<\/em> as a method to visualise the story behind the data, and point out that by adding them to your arsenal you are likely to conduct better analyses and experiments.<\/p>\n<p class=\"wp-block-paragraph\">My ultimate objective is to whet your appetite to explore causality further, as I believe that by asking data <em>\u201cWhy?\u201d<\/em> you will be able to go beyond correlation calculations and extract more insights, as well as avoid common pitfalls of misjudgement.<\/p>\n<p class=\"wp-block-paragraph\">Note that throughout this gentle intro I do not use equations, but demonstrate the ideas with accessible, intuitive visuals. That said, I provide resources for you to take your next step in adding <a href=\"https:\/\/towardsdatascience.com\/tag\/causal-inference\/\" title=\"Causal Inference\">Causal Inference<\/a> to your statistical toolbox so that you may get more value from your data.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Era of Data-Driven Decision Making<\/strong><\/h2>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>In [Deity] We Trust, All Others Bring Data! \u2014 William E. Deming<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">In this digital age it is common to put a lot of faith in data. 
But this raises an overlooked question: Should we trust data on its own?<\/p>\n<p class=\"wp-block-paragraph\">Judea Pearl, who is considered the godfather of <a href=\"https:\/\/towardsdatascience.com\/tag\/causality\/\" title=\"Causality\">Causality<\/a>, articulated it best:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>\u201cThe collection of information is as important as the information itself\u201d \u2014 Judea Pearl<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">In other words, <strong>the story behind<\/strong> the data is as important as the data itself.<\/p>\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img data-recalc-dims=\"1\" loading=\"lazy\" data-dominant-color=\"987f76\" data-has-transparency=\"true\" decoding=\"async\" width=\"828\" height=\"860\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.02%25E2%2580%25AFPM.png?resize=828%2C860&#038;ssl=1\" alt=\"\" class=\"wp-image-597965 has-transparency\" style=\"--dominant-color: #987f76; width:294px;height:auto\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.02\u202fPM.png 828w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.02\u202fPM-289x300.png 289w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.02\u202fPM-768x798.png 768w\" sizes=\"(max-width: 828px) 100vw, 828px\"><figcaption class=\"wp-element-caption\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Judea_Pearl\">Judea Pearl<\/a> is considered the Godfather of Causality. 
Credit: <a href=\"https:\/\/medium.com\/u\/f390f1bdd353?source=post_page---user_mention--47142d5ec96a---------------------------------------\">Aleksander Molak<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">This manifests in a growing awareness of the importance of identifying bias in datasets. By the end of this post I hope that you will appreciate that causality provides the fundamental tools to express, quantify and attempt to correct for these biases.<\/p>\n<p class=\"wp-block-paragraph\">In causality introductions it is customary to demonstrate why \u201ccorrelation does not imply causation\u201d by highlighting limitations of association analysis due to spurious correlations (e.g., shark attacks <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f988.png?ssl=1\" alt=\"\ud83e\udd88\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> and ice-cream sales <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f366.png?ssl=1\" alt=\"\ud83c\udf66\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">). To keep this post short, I defer this aspect to <a href=\"https:\/\/bit.ly\/start-ask-why-post2\">an older one of mine<\/a>. 
Here I focus on two mind-boggling paradoxes <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f92f.png?ssl=1\" alt=\"\ud83e\udd2f\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> and their resolution via <em>causal graphs<\/em> to make a similar point.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Paradoxes in Analysis<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">To understand the importance of the story behind the data, we will examine two counter-intuitive (but nonetheless true) paradoxes which are classical situations of data misinterpretation.<\/p>\n<p class=\"wp-block-paragraph\">In the first, we imagine a clinical trial in which patients are given a treatment that results in a health score. Our objective is to assess the average impact of increasing the treatment on the health outcome. For pedagogical purposes, in these examples we assume that samples are representative (i.e., the sample size is not an issue) and that variances in measurements are minimal.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" data-dominant-color=\"edeaea\" data-has-transparency=\"true\" style=\"--dominant-color: #edeaea;\" decoding=\"async\" width=\"1024\" height=\"872\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.16%25E2%2580%25AFPM-1024x872.png?resize=1024%2C872&#038;ssl=1\" alt=\"\" class=\"wp-image-597966 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.16\u202fPM-1024x872.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.16\u202fPM-300x255.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.16\u202fPM-768x654.png 768w, 
https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.16\u202fPM.png 1400w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Population outcome of an imaginary clinical trial. Each dot is one patient and the red line indicates the na\u00efve population trend.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">From the figure above we learn that, on average, increasing the treatment appears to be beneficial since it results in a better outcome.<\/p>\n<p class=\"wp-block-paragraph\">Now we\u2019ll colour-code the data by age and gender groupings and examine how increasing the treatment impacts each cohort.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f1eae8\" data-has-transparency=\"true\" style=\"--dominant-color: #f1eae8;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"871\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.30%25E2%2580%25AFPM-1024x871.png?resize=1024%2C871&#038;ssl=1\" alt=\"\" class=\"wp-image-597967 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.30\u202fPM-1024x871.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.30\u202fPM-300x255.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.30\u202fPM-768x653.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.30\u202fPM.png 1460w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Same data as before where each symbol represents an age-gender cohort.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Track any cohort (e.g., \u201cGirls\u201d representing young females) and you immediately realise that an increase in treatment 
appears adverse.<\/p>\n<p class=\"wp-block-paragraph\">What is the conclusion of the study? On the one hand, increasing the treatment appears to be better for the population at large, but when examining gender-age cohorts it seems disadvantageous. This is Simpson\u2019s Paradox, which may be stated as:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>\u201cTrends can exist in subgroups but reverse for the whole\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Below we will resolve this paradox using causality tools, but beforehand let\u2019s explore another interesting one, which also examines made-up data.<\/p>\n<p class=\"wp-block-paragraph\">Imagine that we quantify the attractiveness and talent of each person in the general population, as in this figure:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" data-dominant-color=\"efeff2\" data-has-transparency=\"true\" style=\"--dominant-color: #efeff2;\" loading=\"lazy\" decoding=\"async\" width=\"676\" height=\"600\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45%25E2%2580%25AFPM.png?resize=676%2C600&#038;ssl=1\" alt=\"\" class=\"wp-image-597954 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45\u202fPM.png 676w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45\u202fPM-300x266.png 300w\" sizes=\"auto, (max-width: 676px) 100vw, 676px\"><figcaption class=\"wp-element-caption\">General population. 
Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Wikipedia<\/a>, created by <a href=\"https:\/\/commons.wikimedia.org\/wiki\/User:Cmglee\">Cmglee<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We find no apparent correlation.<\/p>\n<p class=\"wp-block-paragraph\">Now we\u2019ll focus on an unusual subset \u2014 famous people:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" data-dominant-color=\"efeff1\" data-has-transparency=\"true\" style=\"--dominant-color: #efeff1;\" loading=\"lazy\" decoding=\"async\" width=\"866\" height=\"740\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52%25E2%2580%25AFPM.png?resize=866%2C740&#038;ssl=1\" alt=\"\" class=\"wp-image-597957 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM.png 866w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM-300x256.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM-768x656.png 768w\" sizes=\"auto, (max-width: 866px) 100vw, 866px\"><figcaption class=\"wp-element-caption\">A subset of celebrities. 
Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Wikipedia<\/a>, created by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Cmglee<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Here we clearly see an anti-correlation that doesn\u2019t exist in the general population.<\/p>\n<p class=\"wp-block-paragraph\">Should we conclude that <em>Talent<\/em> and <em>Attractiveness<\/em> are independent variables, as per the first plot of the general population, or that they are correlated, as per that of celebrities?<\/p>\n<p class=\"wp-block-paragraph\">This is Berkson\u2019s Paradox, where one population shows a trend between traits that another lacks.<\/p>\n<p class=\"wp-block-paragraph\">Whereas an algorithm would identify these correlations, resolving these paradoxes requires a full understanding of the context, which normally is not fed to a computer. In other words, without knowing the story behind the data, results may be <strong>misinterpreted<\/strong> and <strong>wrong conclusions<\/strong> may be inferred.<\/p>\n<p class=\"wp-block-paragraph\">Mastering the identification and resolution of these paradoxes is an important <strong>first step<\/strong> to elevating one\u2019s analyses from correlations to <strong>causal inference<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">Whereas these simple examples may be explained away logically, for the purpose of learning causal tools I\u2019ll introduce <em>Causal Graphs<\/em> in the next section.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Causal Graphs: Visualising The Story Behind The Data<\/strong><\/h2>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>\u201c[From the Simpson\u2019s and Berkson\u2019s Paradoxes we learn that] <\/em><strong><em>certain decisions<\/em><\/strong><em> <\/em><strong><em>cannot be made<\/em><\/strong><em> on the basis of<\/em><strong><em> data alone<\/em><\/strong><em>, but 
instead depend on the <\/em><strong><em>story behind the data<\/em><\/strong><em>. \u2026 Graph Theory enables these stories to be conveyed\u201d \u2014 Judea Pearl<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Causal graph models are probabilistic graphical models used to visualise the story behind the data. They are perhaps one of the most powerful tools for analysts that is not taught in most statistics curricula. They are both elegant and highly informative. Hopefully by the end of this post you will appreciate it when Judea Pearl says that this is the missing vocabulary to communicate causality.<\/p>\n<p class=\"wp-block-paragraph\">To understand causal graph models (or causal graphs for short) we start with the following illustration of an example <em>undirected graph<\/em> with four nodes\/vertices and three edges.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f2f2f2\" data-has-transparency=\"false\" style=\"--dominant-color: #f2f2f2;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"623\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.09%25E2%2580%25AFPM-1024x623.png?resize=1024%2C623&#038;ssl=1\" alt=\"\" class=\"wp-image-597958 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.09\u202fPM-1024x623.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.09\u202fPM-300x183.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.09\u202fPM-768x468.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.09\u202fPM.png 1416w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">An undirected graph with four nodes\/vertices and three 
edges<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Each node is a variable and the edges communicate \u201cwho is <strong>related<\/strong> to whom?\u201d (i.e., correlations, joint probabilities). A <em>directed graph<\/em> is one in which we add arrows, as in this figure.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"ececec\" data-has-transparency=\"false\" style=\"--dominant-color: #ececec;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"943\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.17%25E2%2580%25AFPM-1024x943.png?resize=1024%2C943&#038;ssl=1\" alt=\"\" class=\"wp-image-597959 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.17\u202fPM-1024x943.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.17\u202fPM-300x276.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.17\u202fPM-768x707.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.17\u202fPM.png 1142w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">A directed graph with four nodes\/vertices and five directed edges<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">A directed edge communicates \u201cwho <strong>listens<\/strong> to whom?\u201d, which is the essence of causation.<\/p>\n<p class=\"wp-block-paragraph\">In this specific example you can notice a cyclical relationship between the C and D nodes. A useful subset of directed graphs is the <em>directed acyclic graphs<\/em> (DAGs), which have no cycles, as in the next figure.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"ededed\" data-has-transparency=\"false\" 
style=\"--dominant-color: #ededed;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"949\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.25%25E2%2580%25AFPM-1024x949.png?resize=1024%2C949&#038;ssl=1\" alt=\"\" class=\"wp-image-597960 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.25\u202fPM-1024x949.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.25\u202fPM-300x278.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.25\u202fPM-768x712.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.25\u202fPM.png 1122w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">A directed acyclic graph with four nodes\/vertices and four directed edges<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Here we see that when starting from any node (e.g., A) there isn\u2019t a path that gets back to it.<\/p>\n<p class=\"wp-block-paragraph\">DAGs are the go-to choice in causality because the absence of feedback between parameters greatly simplifies the flow of information. (For mechanisms that have feedback, e.g., temporal systems, one may consider unrolling nodes as a function of time, but that is beyond the scope of this intro.)<\/p>\n<p class=\"wp-block-paragraph\">Causal graphs are powerful at conveying the cause\/effect relationships between the parameters, and hence how the data was generated (the story behind the data).<\/p>\n<p class=\"wp-block-paragraph\">From a practical point of view, graphs enable us to understand which parameters are confounders that need to be controlled for, and, just as importantly, which not to control for, because doing so would introduce spurious correlations. 
This will be demonstrated below.<\/p>\n<p class=\"wp-block-paragraph\">The practice of attempting to build a causal graph enables you to:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Design better experiments.<\/li>\n<li class=\"wp-block-list-item\">Draw causal conclusions (going beyond correlations by representing interventions and counterfactuals and by encoding conditional independence relationships; all beyond the scope of this post).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">To further motivate the use of causal graph models, we will use them to resolve the Simpson\u2019s and Berkson\u2019s paradoxes introduced above.<\/p>\n<h2 class=\"wp-block-heading\"><strong><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f48a.png?ssl=1\" alt=\"\ud83d\udc8a\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> Causal Graph Resolution of Simpson\u2019s Paradox<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">For simplicity, we\u2019ll examine Simpson\u2019s paradox focusing on two cohorts: male and female adults.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f3f1f1\" data-has-transparency=\"true\" style=\"--dominant-color: #f3f1f1;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"864\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.37%25E2%2580%25AFPM-1024x864.png?resize=1024%2C864&#038;ssl=1\" alt=\"\" class=\"wp-image-597961 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.37\u202fPM-1024x864.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.37\u202fPM-300x253.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.37\u202fPM-768x648.png 768w, 
https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.37\u202fPM.png 1460w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Outcome of the imaginary therapeutic trial, similar to the previous but focusing on the adults. Each symbol is one patient from the respective age-gender cohort and the red line indicates the na\u00efve population trend.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Examining this data we can make three statements about three variables of interest:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Gender is an independent variable (it does not \u201clisten to\u201d the other two)<\/li>\n<li class=\"wp-block-list-item\">Treatment depends on Gender (as we can see, in this setting the level given depends on Gender \u2014 women have been given, for some reason, a higher dosage.)<\/li>\n<li class=\"wp-block-list-item\">Outcome depends on both Gender and Treatment<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">According to these we can draw the causal graph as the following:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f2f2f2\" data-has-transparency=\"false\" style=\"--dominant-color: #f2f2f2;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"876\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.59%25E2%2580%25AFPM-1024x876.png?resize=1024%2C876&#038;ssl=1\" alt=\"\" class=\"wp-image-597951 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.59\u202fPM-1024x876.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.59\u202fPM-300x257.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.59\u202fPM-768x657.png 768w, 
https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.38.59\u202fPM.png 1496w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Simpson\u2019s paradox Graphic Model where Gender is a confounding variable between Treatment and Outcome<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Notice how each arrow helps communicate the statements above. Just as importantly, the lack of an arrow pointing into Gender conveys that it is an independent variable.<\/p>\n<p class=\"wp-block-paragraph\">We also notice that, because arrows point from Gender to both Treatment and Outcome, Gender is considered a <em>common cause<\/em> of the two.<\/p>\n<p class=\"wp-block-paragraph\">The essence of Simpson\u2019s paradox is that although the Outcome is affected by changes in Treatment, as expected, information also flows through a <em>backdoor path<\/em> via Gender.<\/p>\n<p class=\"wp-block-paragraph\">As you may have guessed by this stage, the solution to this paradox is that the common cause Gender is a confounding variable that needs to be <em>controlled<\/em>.<\/p>\n<p class=\"wp-block-paragraph\">Controlling for a variable, in terms of this causal graph, means eliminating the relationship between Gender and Treatment.<\/p>\n<p class=\"wp-block-paragraph\">This may be done in two ways:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Pre data collection: setting up a <strong><em>Randomised Controlled Trial<\/em><\/strong> (RCT) in which participants are given a dosage regardless of their Gender.<\/li>\n<li class=\"wp-block-list-item\">Post data collection: e.g., in this made-up scenario the data has already been collected, and hence we need to deal with what is referred to as <strong><em>Observational Data<\/em><\/strong><em>.<\/em>\n<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">In both pre- and post-data collection, the elimination of the Treatment 
dependency on Gender (i.e., controlling for Gender) may be represented by modifying the graph such that the arrow between them is removed, as in the following:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"dcdcdc\" data-has-transparency=\"false\" style=\"--dominant-color: #dcdcdc;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"908\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.21%25E2%2580%25AFPM-1024x908.png?resize=1024%2C908&#038;ssl=1\" alt=\"\" class=\"wp-image-597952 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.21\u202fPM-1024x908.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.21\u202fPM-300x266.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.21\u202fPM-768x681.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.21\u202fPM.png 1378w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">A modified version of the Simpson\u2019s paradox Graphic Model. 
The dark node means we control for Gender.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Applying this \u201cgraphical surgery\u201d means that the last two statements need to be modified (for convenience I\u2019ll write all three):<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Gender is an independent variable<\/li>\n<li class=\"wp-block-list-item\">Treatment is an independent variable<\/li>\n<li class=\"wp-block-list-item\">Outcome depends on Gender and Treatment (but with no backdoor path).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">This enables us to obtain the causal relationship of interest: we can assess the direct impact of modifying Treatment on the Outcome.<\/p>\n<p class=\"wp-block-paragraph\">The process of controlling for a confounder, i.e., manipulating the data generation process, is formally referred to as applying an <em>intervention<\/em>. That is to say, we are no longer passive observers of the data; we take an active role in modifying it to assess the causal impact.<\/p>\n<p class=\"wp-block-paragraph\">How is this manifested in practice?<\/p>\n<p class=\"wp-block-paragraph\">In the case of RCTs the researcher needs to control for important confounding variables. Here we limit the discussion to Gender (but in real-world settings you can imagine other variables such as Age, Social Status and anything else that might be relevant to one\u2019s health).<\/p>\n<p class=\"wp-block-paragraph\">RCTs are considered the gold standard for causal analysis in many experimental settings thanks to their practice of controlling for confounding variables. 
That said, RCTs have several drawbacks:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">It may be <strong>expensive<\/strong> to recruit individuals and may be complicated <strong>logistically<\/strong>\n<\/li>\n<li class=\"wp-block-list-item\">The intervention under investigation may not be <strong>physically<\/strong> possible or <strong>ethical<\/strong> to conduct (e.g., one can\u2019t ask randomly selected people to smoke or not for ten years)<\/li>\n<li class=\"wp-block-list-item\">The artificial setting of a laboratory is not the true natural <strong>habitat<\/strong> of the population.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Observational data, on the other hand, is much more readily available in industry and academia, and hence much cheaper, and could be more representative of the actual habits of individuals. But as illustrated in the Simpson\u2019s diagram, it may have confounding variables that need to be controlled for.<\/p>\n<p class=\"wp-block-paragraph\">This is where ingenious solutions developed in the causal community in the past few decades are making headway. Detailing them is beyond the scope of this post, but I briefly mention how to learn more at the end.<\/p>\n<p class=\"wp-block-paragraph\">To resolve this Simpson\u2019s paradox with the given observational data, one:<\/p>\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Calculates, for each cohort, the impact of a change in the treatment on the outcome.<\/li>\n<li class=\"wp-block-list-item\">Calculates a weighted average of these per-cohort impacts, weighting each cohort by its share of the population.<\/li>\n<\/ol>\n<p class=\"wp-block-paragraph\">Here we will focus on intuition, but in a future post we will describe the maths behind this solution.<\/p>\n<p class=\"wp-block-paragraph\">I am sure that many analysts, just like myself, have noticed Simpson\u2019s paradox at some stage in their data and hopefully have corrected for it. 
Now you know the name of this effect and can hopefully start to appreciate how useful causal tools are.<\/p>\n<p class=\"wp-block-paragraph\"><strong>That said \u2026 being confused at this stage is OK <\/strong><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f615.png?ssl=1\" alt=\"\ud83d\ude15\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><\/p>\n<p class=\"wp-block-paragraph\">I\u2019ll be the first to admit that I struggled to understand this concept, and it took me three weekends of deep diving into examples to internalise it. This was the gateway drug to causality for me. Part of my process of understanding statistics is playing with data. For this purpose I created <a href=\"https:\/\/bit.ly\/simpson-calculator\">an interactive web application hosted on Streamlit<\/a>, which I call Simpson\u2019s Calculator <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f9ee.png?ssl=1\" alt=\"\ud83e\uddee\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">. I\u2019ll write a separate post for this in the future.<\/p>\n<p class=\"wp-block-paragraph\">Even if you are confused, the main takeaways of Simpson\u2019s paradox are that:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">It is a situation where trends can exist in subgroups but reverse for the whole.<\/li>\n<li class=\"wp-block-list-item\">It may be resolved by identifying confounding variables between the treatment and the outcome variables and controlling for them.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">This raises the question: should we just control for all variables except for the treatment and outcome? 
Let\u2019s keep this in mind when resolving Berkson\u2019s paradox.<\/p>\n<h2 class=\"wp-block-heading\"><strong><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f99a.png?ssl=1\" alt=\"\ud83e\udd9a\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> Causal Graph Resolution of Berkson\u2019s Paradox<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">As in the previous section, we are going to make clear statements about how we believe the data was generated and then draw these in a causal graph.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s examine the case of the general population. For convenience I\u2019m copying the image from above:<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" data-dominant-color=\"efeff2\" data-has-transparency=\"true\" style=\"--dominant-color: #efeff2;\" loading=\"lazy\" decoding=\"async\" width=\"676\" height=\"600\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45%25E2%2580%25AFPM.png?resize=676%2C600&#038;ssl=1\" alt=\"\" class=\"wp-image-597954 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45\u202fPM.png 676w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.45\u202fPM-300x266.png 300w\" sizes=\"auto, (max-width: 676px) 100vw, 676px\"><figcaption class=\"wp-element-caption\">General population. 
Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Wikipedia<\/a>, created by <a href=\"https:\/\/commons.wikimedia.org\/wiki\/User:Cmglee\">Cmglee<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Here we understand that:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Talent is an independent variable<\/li>\n<li class=\"wp-block-list-item\">Attractiveness is an independent variable<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">A causal graph for this is quite simple: two nodes without an edge.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"efefef\" data-has-transparency=\"false\" style=\"--dominant-color: #efefef;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"417\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.57%25E2%2580%25AFPM-1024x417.png?resize=1024%2C417&#038;ssl=1\" alt=\"\" class=\"wp-image-597956 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.57\u202fPM-1024x417.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.57\u202fPM-300x122.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.57\u202fPM-768x313.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.39.57\u202fPM.png 1380w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">In the general population one\u2019s Talent and Attractiveness are independent<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Let\u2019s examine the plot of the celebrity subset.<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" data-dominant-color=\"efeff1\" data-has-transparency=\"true\" style=\"--dominant-color: #efeff1;\" loading=\"lazy\" 
decoding=\"async\" width=\"866\" height=\"740\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52%25E2%2580%25AFPM.png?resize=866%2C740&#038;ssl=1\" alt=\"\" class=\"wp-image-597957 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM.png 866w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM-300x256.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.37.52\u202fPM-768x656.png 768w\" sizes=\"auto, (max-width: 866px) 100vw, 866px\"><figcaption class=\"wp-element-caption\">A subset of celebrities. Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Wikipedia<\/a>, created by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Berkson%27s_paradox\">Cmglee<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The cheeky insight from this mock data is that the more attractive one is, the less talented they need to be to become a celebrity. Hence we can deduce that:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Talent is an independent variable<\/li>\n<li class=\"wp-block-list-item\">Attractiveness is an independent variable<\/li>\n<li class=\"wp-block-list-item\">The Celebrity variable depends on both the Talent and Attractiveness variables. 
(Imagine this variable is boolean: true for celebrities, false otherwise).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Hence we can draw the causal graph as:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f3f3f3\" data-has-transparency=\"false\" style=\"--dominant-color: #f3f3f3;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"994\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.26%25E2%2580%25AFPM-1024x994.png?resize=1024%2C994&#038;ssl=1\" alt=\"\" class=\"wp-image-597948 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.26\u202fPM-1024x994.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.26\u202fPM-300x291.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.26\u202fPM-768x745.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.26\u202fPM.png 1294w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Being a celebrity depends on Talent and Attractiveness<br \/>\n<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Because both arrows point into it, Celebrity is a <em>collider<\/em> node between Talent and Attractiveness.<\/p>\n<p class=\"wp-block-paragraph\">Berkson\u2019s paradox is the fact that when we control for Celebrity (that is, look only at celebrities) we see a trend (anti-correlation between Attractiveness and Talent) that is not seen in the general population.<\/p>\n<p class=\"wp-block-paragraph\">This can be visualised in the causal graph: by controlling for the Celebrity variable we create a spurious correlation between the otherwise independent variables Talent and Attractiveness. 
We can draw this as the following:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"e0e0e0\" data-has-transparency=\"false\" style=\"--dominant-color: #e0e0e0;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"945\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.40%25E2%2580%25AFPM-1024x945.png?resize=1024%2C945&#038;ssl=1\" alt=\"\" class=\"wp-image-597949 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.40\u202fPM-1024x945.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.40\u202fPM-300x277.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.40\u202fPM-768x708.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.40\u202fPM.png 1366w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Berkson\u2019s paradox Graphic Model. The dark node means we control for Celebrity. 
Controlling this collider variable generates a spurious correlation (dashed line) between Talent and Attractiveness.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The solution of this Berkson\u2019s paradox should be apparent here: Talent and Attractiveness are independent variables in general, but controlling for the collider Celebrity node creates a spurious correlation in the data.<\/p>\n<h1 class=\"wp-block-heading\"><strong>Paradoxes Summary<\/strong><\/h1>\n<p class=\"wp-block-paragraph\">Let\u2019s compare the resolution of both paradoxes:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Simpson\u2019s Paradox is resolved by <strong>controlling<\/strong> for the common cause (Gender)<\/li>\n<li class=\"wp-block-list-item\">Berkson\u2019s Paradox is resolved by <strong>not controlling<\/strong> for the collider (Celebrity)<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">The next figure combines both insights in the form of their causal graphs:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"ececec\" data-has-transparency=\"false\" style=\"--dominant-color: #ececec;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"483\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.50%25E2%2580%25AFPM-1024x483.png?resize=1024%2C483&#038;ssl=1\" alt=\"\" class=\"wp-image-597950 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.50\u202fPM-1024x483.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.50\u202fPM-300x141.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.50\u202fPM-768x362.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.40.50\u202fPM.png 1374w\" sizes=\"auto, (max-width: 1024px) 
100vw, 1024px\"><figcaption class=\"wp-element-caption\">Graph models show how to resolve the paradoxes. Dark nodes are controlled for. Left: Modified graph to resolve Simpson\u2019s paradox by controlling for Gender. Right: To resolve Berkson\u2019s paradox the collider should not be controlled for.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The main takeaway from the resolution of these paradoxes is that controlling for variables requires justification. Common causes should be controlled for but colliders should not.<\/p>\n<p class=\"wp-block-paragraph\">Even though this is common knowledge for those who study causality (e.g., Economics majors), it is unfortunate that many analysts and machine learning practitioners are not aware of this (including myself in 2020 after over 15 years of analysis and predictive modelling experience).<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">\u201c<em>Oddly, statisticians both over- and underrate the importance of confounders<\/em>\u201d<em> \u2014 Judea Pearl<\/em><\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\"><strong>Summary<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">The main takeaway from this post is that the story behind the data is as important as the data itself.<\/p>\n<p class=\"wp-block-paragraph\">Appreciating this will help you avoid misinterpreting results, whether through spurious correlations or, as demonstrated here, through Simpson\u2019s and Berkson\u2019s paradoxes.<\/p>\n<p class=\"wp-block-paragraph\">Causal Graphs are an essential tool to visualise the story behind the data. 
By using them to resolve the paradoxes we learnt that controlling for variables requires justification (common causes <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/2705.png?ssl=1\" alt=\"\u2705\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">, colliders <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/26d4.png?ssl=1\" alt=\"\u26d4\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">).<\/p>\n<p class=\"wp-block-paragraph\">For those interested in taking the next step in their causal journey I highly suggest mastering Simpson\u2019s paradox. One great way is by playing with data. Feel free to do so with my interactive \u201c<a href=\"https:\/\/bit.ly\/simpson-calculator\">Simpson-calculator<\/a>\u201d <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f9ee.png?ssl=1\" alt=\"\ud83e\uddee\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">.<\/p>\n<p class=\"wp-block-paragraph\">Loved this post? 
<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f48c.png?ssl=1\" alt=\"\ud83d\udc8c\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> Join me on <a href=\"https:\/\/www.linkedin.com\/in\/eyal-kazin\">LinkedIn<\/a> or <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/2615.png?ssl=1\" alt=\"\u2615\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> <a href=\"http:\/\/buymeacoffee.com\/zurdo\">Buy me a coffee<\/a>!<\/p>\n<h2 class=\"wp-block-heading\"><strong>Credits<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Unless otherwise noted, all images were created by the author.<\/p>\n<p class=\"wp-block-paragraph\">Many thanks to <a href=\"about:blank\">Jim Parr<\/a>, <a href=\"https:\/\/www.linkedin.com\/in\/reynoldswilliam\/\">Will Reynolds<\/a>, <a href=\"https:\/\/www.linkedin.com\/in\/hedva-kazin\/\">Hedva Kazin<\/a> and <a href=\"https:\/\/www.linkedin.com\/in\/betty-kazin-68711812\/\">Betty Kazin<\/a> for their useful comments.<\/p>\n<p class=\"wp-block-paragraph\">Wondering what your next step should be in your causal journey? Check out my new article on <a href=\"https:\/\/medium.com\/towards-data-science\/mastering-simpsons-paradox-my-gateway-drug-to-causality-87e10b613a80\">mastering Simpson\u2019s Paradox<\/a> \u2014 you will never look at data the same way. 
<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f50e.png?ssl=1\" alt=\"\ud83d\udd0e\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><\/p>\n<h2 class=\"wp-block-heading\"><strong>Useful Resources<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Here I provide resources that I find useful as well as a shopping list of topics for beginners to learn.<\/p>\n<h3 class=\"wp-block-heading\">\n<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f4da.png?ssl=1\" alt=\"\ud83d\udcda\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> <strong>Books<\/strong><br \/>\n<\/h3>\n<figure class=\"wp-block-image size-large is-resized\"><img data-recalc-dims=\"1\" data-dominant-color=\"bdb9b2\" data-has-transparency=\"true\" loading=\"lazy\" decoding=\"async\" width=\"685\" height=\"1024\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04%25E2%2580%25AFPM-685x1024.png?resize=685%2C1024&#038;ssl=1\" alt=\"\" class=\"wp-image-597970 has-transparency\" style=\"--dominant-color: #bdb9b2; width:254px;height:auto\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04\u202fPM-685x1024.png 685w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04\u202fPM-201x300.png 201w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04\u202fPM-768x1148.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04\u202fPM-1027x1536.png 1027w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.04\u202fPM.png 1058w\" sizes=\"auto, (max-width: 685px) 100vw, 685px\"><figcaption class=\"wp-element-caption\">Credit: <a 
href=\"https:\/\/unsplash.com\/@gaellemarcel\">Gaelle Marcel<\/a><\/figcaption><\/figure>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<em>The Book of Why<\/em> \u2014 popular science reading (NY Times level)<\/li>\n<li class=\"wp-block-list-item\">\n<em>Causal Inference in Statistics: A Primer<\/em> \u2014 excellent short technical book (<a href=\"http:\/\/bayes.cs.ucla.edu\/PRIMER\/\">site<\/a>)<\/li>\n<li class=\"wp-block-list-item\">\n<em>Causal Inference and Discovery in Python<\/em> by <a href=\"https:\/\/medium.com\/u\/f390f1bdd353?source=post_page---user_mention--47142d5ec96a---------------------------------------\">Aleksander Molak<\/a> (<a href=\"https:\/\/www.packtpub.com\/product\/causal-inference-and-discovery-in-python\/9781804612989\">Packt<\/a>, <a href=\"https:\/\/github.com\/PacktPublishing\/Causal-Inference-and-Discovery-in-Python\">github<\/a>) \u2014 clearly explained with Python applications <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f40d.png?ssl=1\" alt=\"\ud83d\udc0d\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">.<\/li>\n<li class=\"wp-block-list-item\">\n<em>What If?<\/em> \u2014 a cohesive presentation of concepts of, and methods for, causal inference (<a href=\"https:\/\/www.hsph.harvard.edu\/miguel-hernan\/causal-inference-book\/\">site<\/a>, <a href=\"https:\/\/github.com\/jrfiedler\/causal_inference_python_code\">github<\/a>)<\/li>\n<li class=\"wp-block-list-item\">\n<em>Causal Inference: The Mixtape<\/em> \u2014 Social Science focused using Python, R and Stata (<a href=\"https:\/\/mixtape.scunning.com\/\">site<\/a>, <a href=\"https:\/\/mixtape.scunning.com\/teaching-resources.html\">resources<\/a>, <a href=\"https:\/\/www.mixtapesessions.io\/index.html\">mooc<\/a>)<\/li>\n<li class=\"wp-block-list-item\">\n<em>Counterfactuals and Causal Inference<\/em> \u2014 Methods and Principles (Social Science 
focused)<\/li>\n<\/ul>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"abcbd9\" data-has-transparency=\"true\" style=\"--dominant-color: #abcbd9;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"317\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.18%25E2%2580%25AFPM-1024x317.png?resize=1024%2C317&#038;ssl=1\" alt=\"\" class=\"wp-image-597971 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.18\u202fPM-1024x317.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.18\u202fPM-300x93.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.18\u202fPM-768x237.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.18\u202fPM.png 1404w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><\/figure>\n<p class=\"wp-block-paragraph\">This list is far from comprehensive, but I\u2019m glad to add to it if anyone has suggestions (please mention why the book stands out from the pack).<\/p>\n<h3 class=\"wp-block-heading\">\n<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f50f.png?ssl=1\" alt=\"\ud83d\udd0f\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> <strong>Courses<\/strong><br \/>\n<\/h3>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"444544\" data-has-transparency=\"true\" style=\"--dominant-color: #444544;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"741\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.30%25E2%2580%25AFPM-1024x741.png?resize=1024%2C741&#038;ssl=1\" alt=\"\" class=\"wp-image-597955 has-transparency\" 
srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.30\u202fPM-1024x741.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.30\u202fPM-300x217.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.30\u202fPM-768x556.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.30\u202fPM.png 1396w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Credit: Austrian National Library<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">There are probably a few courses online. I love the <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f193.png?ssl=1\" alt=\"\ud83c\udd93\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> one by Brady Neal: <a href=\"https:\/\/www.bradyneal.com\/causal-inference-course\"><strong>bradyneal.com\/causal-inference-course<\/strong><\/a><strong>.<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Clearly explained<\/li>\n<li class=\"wp-block-list-item\">Covers many aspects<\/li>\n<li class=\"wp-block-list-item\">Thorough<\/li>\n<li class=\"wp-block-list-item\">Provides memorable examples<\/li>\n<li class=\"wp-block-list-item\">F.R.E.E<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">One paid course <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f4b0.png?ssl=1\" alt=\"\ud83d\udcb0\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> that is targeted at practitioners is <a href=\"https:\/\/altdeep.ai\/p\/causalml\">Altdeep<\/a>.<\/p>\n<h3 class=\"wp-block-heading\">\n<img data-recalc-dims=\"1\" decoding=\"async\" 
src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f4be.png?ssl=1\" alt=\"\ud83d\udcbe\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> <strong>Software<\/strong><br \/>\n<\/h3>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"393d35\" data-has-transparency=\"true\" style=\"--dominant-color: #393d35;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"769\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.40%25E2%2580%25AFPM-1024x769.png?resize=1024%2C769&#038;ssl=1\" alt=\"\" class=\"wp-image-597963 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.40\u202fPM-1024x769.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.40\u202fPM-300x225.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.40\u202fPM-768x577.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.40\u202fPM.png 1390w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">Credit: <a href=\"https:\/\/unsplash.com\/@artturijalli\">Artturi Jalli<\/a><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">This list is far from comprehensive because the space is rapidly growing:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<a href=\"http:\/\/www.dagitty.net\/\">dagitty.net<\/a> \u2014 a web application to construct and interpret causal graphs<\/li>\n<li class=\"wp-block-list-item\">\n<a href=\"https:\/\/www.pywhy.org\/\">PyWhy<\/a> \u2014 open source ecosystem for Causal Machine Learning (includes popular packages such as <a href=\"https:\/\/www.pywhy.org\/dowhy\/v0.11.1\/\">dowhy<\/a> and <a href=\"https:\/\/econml.azurewebsites.net\/\">econml<\/a>).<\/li>\n<li 
class=\"wp-block-list-item\">\n<a href=\"https:\/\/github.com\/pymc-labs\/CausalPy\">CausalPy<\/a> \u2014 A Bayesian approach to causality by the good people at <a href=\"https:\/\/medium.com\/u\/7c6b7b6803cd?source=post_page---user_mention--47142d5ec96a---------------------------------------\">PyMC Developers<\/a>.<\/li>\n<li class=\"wp-block-list-item\">\n<a href=\"https:\/\/github.com\/jakobrunge\/tigramite\">tigramite<\/a> \u2014 Causal inference with a focus on time series data<\/li>\n<li class=\"wp-block-list-item\">\n<a href=\"https:\/\/causalwizard.app\/\">Causal Wizard<\/a> \u2014 software for effect estimation by <a href=\"https:\/\/medium.com\/u\/a78412ea4d96?source=post_page---user_mention--47142d5ec96a---------------------------------------\">Causal Wizard app<\/a>\n<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/medium.com\/u\/a78412ea4d96?source=post_page---user_mention--47142d5ec96a---------------------------------------\">Causal Wizard app<\/a> also has an article about <a href=\"https:\/\/medium.com\/@causalwizard\/online-causal-diagram-and-dag-drawing-editing-tools-900bb1815c86\">Causal Diagram tools<\/a>.<\/p>\n<h3 class=\"wp-block-heading\"><strong><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f43e.png?ssl=1\" alt=\"\ud83d\udc3e\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> Suggested Next Steps In The Causal Journey<\/strong><\/h3>\n<p class=\"wp-block-paragraph\">Here I highlight a list of topics which I would have found useful when I started learning the field. If I\u2019m missing anything I\u2019d be more than glad to get feedback and add it. 
I have put in bold the ones that were briefly discussed here.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f0e9d4\" data-has-transparency=\"true\" style=\"--dominant-color: #f0e9d4;\" loading=\"lazy\" decoding=\"async\" width=\"845\" height=\"1024\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53%25E2%2580%25AFPM-845x1024.png?resize=845%2C1024&#038;ssl=1\" alt=\"\" class=\"wp-image-597946 has-transparency\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53\u202fPM-845x1024.png 845w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53\u202fPM-247x300.png 247w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53\u202fPM-768x931.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53\u202fPM-1267x1536.png 1267w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/Screenshot-2025-02-14-at-1.41.53\u202fPM.png 1318w\" sizes=\"auto, (max-width: 845px) 100vw, 845px\"><figcaption class=\"wp-element-caption\">Pearl\u2019s Causal Hierarchy of seeing, doing, imagining and their applications. This is an approved modification of the original illustration by <a href=\"https:\/\/medium.com\/u\/67a090f94b6f?source=post_page---user_mention--47142d5ec96a---------------------------------------\">Maayan Harel<\/a> from <a href=\"https:\/\/www.maayanvisuals.com\/\">MaayanVisuals.com<\/a> in <a href=\"https:\/\/bayes.cs.ucla.edu\/WHY\/\">The Book of Why<\/a>.<\/figcaption><\/figure>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Pearl\u2019s Causal Hierarchy of seeing, doing and imagining (figure above)<\/li>\n<li class=\"wp-block-list-item\"><strong>Observational data vs. 
Randomised Control Trials<\/strong><\/li>\n<li class=\"wp-block-list-item\">d-separation, <strong>common causes<\/strong>, <strong>colliders<\/strong>, mediators, instrumental variables<\/li>\n<li class=\"wp-block-list-item\"><strong>Causal Graphs<\/strong><\/li>\n<li class=\"wp-block-list-item\">Structural Causal Models<\/li>\n<li class=\"wp-block-list-item\">Assumptions: Ignorability, SUTVA, Consistency, Positivity<\/li>\n<li class=\"wp-block-list-item\">\u201cDo\u201d Algebra \u2014 assessing impact on cohorts by intervention<\/li>\n<li class=\"wp-block-list-item\">Counterfactuals \u2014 assessing impact on individuals by comparing real outcomes to potential ones<\/li>\n<li class=\"wp-block-list-item\">The fundamental problem of causality<\/li>\n<li class=\"wp-block-list-item\">Estimand, Estimator, Estimate, Identifiability \u2014 relating causal definitions to observable statistics (e.g., conditional probabilities)<\/li>\n<li class=\"wp-block-list-item\">Causal Discovery \u2014 finding causal graphs with data (e.g., Markov Equivalence)<\/li>\n<li class=\"wp-block-list-item\">Causal Machine Learning (e.g., Double Machine Learning)<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">For completeness it is useful to know that there are different streams of causality. Although there is a lot of overlap, you may find that methods differ in naming convention because they were developed in different fields of research: Computer Science, Social Sciences, Health, Economics.<\/p>\n<p class=\"wp-block-paragraph\">Here I used definitions mostly from the Pearlian perspective (as developed in the field of computer science).<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Story Behind This Post<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">This narrative is the result of two study groups that I ran in a previous role to help myself and colleagues learn about causality, which I felt was missing from my skill set. 
If there is any interest I\u2019m glad to write a post about the study group experience.<\/p>\n<p class=\"wp-block-paragraph\">This is the intro I felt I needed when I started my journey in causality.<\/p>\n<p class=\"wp-block-paragraph\">In the <a href=\"https:\/\/bit.ly\/start-ask-why-post\">first iteration of this post<\/a> I presented the limitations of spurious correlations and Simpson\u2019s paradox. The main reason for revising it to focus on two paradoxes is that, whereas most causality intros focus on the limitations of correlations, I feel that the need to justify which variables to control for is something all analysts and machine learning practitioners should be aware of.<\/p>\n<p class=\"wp-block-paragraph\">On September 5th 2024 I presented this content in a contributed talk at the Royal Statistical Society Annual Conference in Brighton, England (<a href=\"https:\/\/virtual.oxfordabstracts.com\/#\/event\/6693\/submission\/18\">abstract link<\/a>).<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXcKUrKPA_kVX5VS2r6ig6wV6FNGbyshqIMH6hB8pS_-AAQKHIcPZlaO9xKYTHSKlHEVWw-2viIU6U_qB5Z8DXFWHuRewDs79XQGEak5LrSXetRQQXwTFolN1KsMlnp1LWguwdcp?key=rZjSY2dvT6tH59HazbqbK0f4\" alt=\"\"><\/figure>\n<p class=\"wp-block-paragraph\">Unfortunately there is no recording, but there are recordings of previous talks of mine:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/bit.ly\/start-ask-why-pydata\">PyData Global 2021<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/bit.ly\/start-ask-why-europython\">EuroPython 2021<\/a><\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">The slides are available at <a href=\"http:\/\/bit.ly\/start-ask-why\">bit.ly\/start-ask-why<\/a>. 
Presenting this material for the first time at PyData Global 2021<\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/start-asking-your-data-why-a-gentle-intro-to-causality\/\">\u27a1\ufe0f Start Asking Your Data \u2018Why?\u2019 \u2014 A Gentle Intro To Causality<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Eyal Kazin PhD<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/start-asking-your-data-why-a-gentle-intro-to-causality\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u27a1\ufe0f Start Asking Your Data \u2018Why?\u2019 \u2014 A Gentle Intro To Causality Correlation does not imply causation. It turns out, however, that with some simple ingenious tricks one can, potentially, unveil causal relationships within standard observational data, without having to resort to expensive randomised control trials. 
This post is targeted towards anyone making data driven [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,1756,210,1757,83,312,67],"tags":[1758,84,163],"class_list":["post-1873","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-causal-analytics","category-causal-inference","category-causality","category-data-science","category-decision-making","category-deep-dives","tag-causality","tag-data","tag-your"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1873"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=1873"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1873\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=1873"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=1873"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=1873"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}