{"id":727,"date":"2024-12-21T07:04:55","date_gmt":"2024-12-21T07:04:55","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2024\/12\/21\/when-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8\/"},"modified":"2024-12-21T07:04:55","modified_gmt":"2024-12-21T07:04:55","slug":"when-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2024\/12\/21\/when-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8\/","title":{"rendered":"When Averages Lie: Moving Beyond Single-Point Predictions"},"content":{"rendered":"<div>\n<h4>The Case for Predicting Full Probability Distributions in Decision-Making<\/h4>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AfnVEMTmE9ETvdimmXusbfw.jpeg?ssl=1\"><\/figure>\n<p>Some people like hot coffee, some people like iced coffee, but no one likes lukewarm coffee. Yet, a simple model trained on coffee temperatures might predict that the next coffee served should be\u2026 lukewarm. This illustrates a fundamental problem in predictive modeling: focusing on single point estimates (e.g., averages) can lead us to meaningless or even misleading conclusions.<\/p>\n<p>In \u201c<a href=\"https:\/\/medium.com\/@loic.merckel\/the-crystal-ball-fallacy-what-perfect-predictive-models-really-mean-aa843067ee30\">The Crystal Ball Fallacy<\/a>\u201d (Merckel, 2024b), we explored how even a perfect predictive model does not tell us exactly what will happen\u200a\u2014\u200ait tells us what could happen and how likely each outcome is. In other words, it reveals the true distribution of a random variable. 
While such a perfect model remains hypothetical, real-world models should still strive to approximate these true distributions.<\/p>\n<p>Yet many predictive models used in the corporate world do something quite different: they focus solely on point estimates\u200a\u2014\u200atypically the mean or the mode\u200a\u2014\u200arather than attempting to capture the full range of possibilities. This is not just a matter of how the predictions are used; this limitation is inherent in the design of many conventional machine learning algorithms. Random forests, generalized linear models (GLM), artificial neural networks (ANNs), and gradient boosting machines, among others, are all designed to predict the expected value (mean) of a distribution when used for regression tasks. In classification problems, while logistic regression and other GLMs naturally attempt to estimate probabilities of class membership, tree-based methods like random forests and gradient boosting produce raw scores that would require additional calibration steps (like isotonic regression or Platt scaling) to be transformed into meaningful probabilities. Yet in practice, this calibration is rarely performed, and even when uncertainty information is available (i.e., the probabilities), it is typically discarded in favor of the single most likely class, i.e., the\u00a0mode.<\/p>\n<p>This oversimplification is sometimes not just inadequate; it can lead to fundamentally wrong conclusions, much like our lukewarm coffee predictor. A stark example is the Gaussian copula formula used to price collateralized debt obligations (CDOs) before the 2008 financial crisis. By reducing the complex relationships between mortgage defaults to a single correlation number, among other issues, this model catastrophically underestimated the possibility of simultaneous defaults (MacKenzie &amp; Spears, 2014). 
This systematic underestimation of extreme risks is so pervasive that some investment funds, like Universa Investments advised by Nassim Taleb, incorporate strategies to capitalize on it. They recognize that markets consistently undervalue the probability and impact of extreme events (Patterson, 2023). When we reduce a complex distribution of possible outcomes to a single number, we lose critical information about uncertainty, risk, and potential extreme events that could drastically impact decision-making.<\/p>\n<p>On the other hand, some quantitative trading firms have built their success partly by properly modeling these complex distributions. When asked about Renaissance Technologies\u2019 approach\u200a\u2014\u200awhose Medallion fund purportedly achieved returns of 66% annually before fees from 1988 to 2018 (Zuckerman, 2019)\u200a\u2014\u200afounder Jim Simons emphasized that they carefully consider that market risk \u201cis typically not a normal distribution, the tails of a distribution are heavier and the inside is not as heavy\u201d (Simons, 2013, 47:41), highlighting the critical importance of looking beyond simple averages.<\/p>\n<p>Why, then, do we persist in using point estimates despite their clear limitations? The reasons may be both practical and cultural. Predicting distributions is technically more challenging than predicting single values, requiring more sophisticated models and greater computational resources. But more fundamentally, most business processes and tools are simply not designed to handle distributional thinking. You cannot put a probability distribution in a spreadsheet cell, and many decision-making frameworks demand concrete numbers rather than ranges of possibilities. 
Moreover, as Kahneman (2011) notes in his analysis of human decision-making, we are naturally inclined to think in terms of specific scenarios rather than statistical distributions\u200a\u2014\u200aour intuitive thinking prefers simple, concrete answers over probabilistic ones.<\/p>\n<p>Let us examine actual housing market data to illustrate potential issues with single-point valuation and possible modeling techniques to capture the full distribution of possible\u00a0values.<\/p>\n<h3>A Deep Dive into Property\u00a0Pricing<\/h3>\n<p>In this section, we use the French Real Estate Transactions (DVF) dataset provided by the French government (gouv.fr, 2024), which contains comprehensive records of property transactions across France. For this analysis, we focus on sale prices, property surface areas, and the number of rooms for the years ranging from 2014 to 2024. Notably, we exclude critical information such as geolocation, as our aim is not to predict house prices but to demonstrate the benefits of predicting distributions over relying solely on single-point estimates.<\/p>\n<p>First, we will go through a fictional\u200a\u2014\u200ayet most likely \u00e0 clef\u200a\u2014\u200acase study where a common machine learning technique is put into action for planning an ambitious real estate operation. Subsequently, we will adopt a critical stance on this case and offer alternatives that many may prefer in order to be better prepared for pulling off the\u00a0trade.<\/p>\n<h4>Case Study: The Homer &amp; Lisa Reliance on AI for Real Estate\u00a0Trading<\/h4>\n<p>Homer and Lisa live in Paris. They expect the family to grow and plan to sell their two-room flat to fund the acquisition of a four-room property. Given the operational and maintenance costs, and the capacity of their newly acquired state-of-the-art Roomba with all options, they reckon that 90m\u00b2 is the perfect surface area for them. 
They want to estimate how much they need to save\/borrow to complement the proceeds from the sale. Homer took a MOOC on machine learning just before graduating in advanced French literature last year, and immediately found\u200a\u2014\u200athanks to his network\u200a\u2014\u200aa data scientist role at a large reputable traditional firm that was heavily investing in expanding (admittedly from scratch, really) its AI capacity to avoid missing out. Now a Principal Senior Lead Data Scientist, after almost a year of experience, he knows quite a bit! (He even <a href=\"https:\/\/medium.com\/@loic.merckel\/data-driven-or-data-derailed-lessons-from-the-hello-world-classifier-764fdf4dbb60\">works for a zoo as a side hustle<\/a>, where his performance has not gone unnoticed\u200a\u2014\u200aMerckel,\u00a02024a.)<\/p>\n<p>Following some googling, he found the real estate dataset freely provided by the government. He did a bit of cleaning, filtering, and aggregating to obtain the perfect ingredients for his ordinary least squares model (OLS for those in the know). He can now confidently predict prices, in the Paris area, from both the number of rooms and the surface area. Their 2-room, 40m\u00b2 flat is worth 365,116\u20ac, and a 4-room, 90m\u00b2 property reaches 804,911\u20ac. That is a no-brainer: they simply need to cover the difference, i.e., 439,795\u20ac.<\/p>\n<h4>Homer &amp; Lisa: The Ones Playing Darts\u2026 Unknowingly!<\/h4>\n<p>Do Homer and Lisa need to save\/borrow 439,795\u20ac? The model certainly suggests so. But is that\u00a0so?<\/p>\n<p>Perhaps Homer, if only he knew, could have provided confidence intervals? 
Using OLS, confidence intervals can either be estimated empirically via bootstrapping or analytically using standard error-based methods.<\/p>\n<p>Besides, even before that, he could have looked at the price distribution, and realized the default OLS methods may not be the best\u00a0choice\u2026<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AwRwQEWuzpNw6TDkensFbxA.jpeg?ssl=1\"><figcaption><strong>Figure 1: Real Estate Prices Near Paris (2014\u20132024):<\/strong> The left plot illustrates the distribution of real estate prices within a 7km radius of central Paris. The right plot shows the distribution of the natural logarithm of those prices. In both histograms, the final bar represents the cumulative count of properties priced above 2,000,000\u20ac (or log(2,000,000) in the logarithmic scale). Image by the\u00a0author.<\/figcaption><\/figure>\n<p>The right-skewed shape with a long tail is hard to miss. For predictive modeling (as opposed to, e.g., explanatory modeling), the primary concern with OLS is not necessarily the normality (and homoscedasticity) of errors but the potential for extreme values in the long tail to disproportionately influence the model\u200a\u2014\u200aOLS minimizes squared errors, making it sensitive to extreme observations, particularly those that deviate significantly from the Gaussian distribution assumed for the\u00a0errors.<\/p>\n<p>A Generalized Linear Model (GLM) extends the linear model framework by directly specifying a distribution for the response variable (from the exponential family) and using a \u201clink function\u201d to connect the linear predictor to the mean of that distribution. 
While linear models assume normally distributed errors and estimate the expected response E(Y) directly through a linear predictor, GLMs allow for different response distributions and transform the relationship between the linear predictor and E(Y) through the link function.<\/p>\n<p>Let us revisit Homer and Lisa\u2019s situation using a simpler but related approach. Rather than implementing a GLM, we can transform the data by taking the natural logarithm of prices before applying a linear model. This implies we are modeling prices as following a log-normal distribution (Figure 1 presents the distribution of prices and the log version). When transforming predictions back to the original scale, we need to account for the bias introduced by the log transformation using Duan\u2019s smearing estimator (Duan, 1983). Using this bias-corrected log-normal model and fitting it on properties around Paris, their current 2-room, 40m\u00b2 flat is estimated at 337,844\u20ac, while their target 4-room, 90m\u00b2 property would cost around 751,884\u20ac, hence a need for an additional 414,040\u20ac.<\/p>\n<p>The log-normal model with smearing correction is particularly suitable for this context because it not only reflects multiplicative relationships, such as price increasing proportionally (by a factor) rather than by a fixed amount when the number of rooms or surface area increases, but also properly accounts for the retransformation bias that would otherwise lead to systematic underestimation of\u00a0prices.<\/p>\n<p>To better understand the uncertainty in these predictions, we can examine their confidence intervals. The 95% bootstrap confidence interval [400,740\u20ac\u200a\u2014\u200a418,618\u20ac] for the mean price difference means that if we were to repeat this sampling process many times, about 95% of such intervals would contain the true mean price difference. 
This interval is more reliable in this context than the standard error-based 95% confidence interval because it does not depend on strict parametric assumptions about the model, such as the distribution of errors or the adequacy of the model\u2019s specification. Instead, it captures the observed data\u2019s variability and complexity, accounting for unmodeled factors and potential deviations from idealized assumptions. For instance, our model only considers the number of rooms and surface area, while real estate prices in Paris are influenced by many other factors\u200a\u2014\u200aproximity to metro stations, architectural style, floor level, building condition, and local neighborhood dynamics, and even broader economic conditions such as prevailing interest\u00a0rates.<\/p>\n<p>In light of this analysis, the log-normal model provides a new and arguably more realistic point estimate of 414,040\u20ac for the price difference. However, the confidence interval, while statistically rigorous, might not be the most useful for Homer and Lisa\u2019s practical planning needs. Instead, to better understand the full range of possible prices and provide more actionable insights for their planning, we might turn to Bayesian modeling. This approach would allow us to estimate the complete probability distribution of potential price differences, rather than just point estimates and confidence intervals.<\/p>\n<h4>The Prior, The Posterior, and The Uncertain<\/h4>\n<p>Bayesian modeling offers a more comprehensive approach to understanding uncertainty in predictions. Instead of calculating just a single \u201cbest guess\u201d price difference or even a confidence interval, Bayesian methods provide the full probability distribution of possible\u00a0prices.<\/p>\n<p>The process begins with expressing our \u201cprior beliefs\u201d about property prices\u200a\u2014\u200awhat we consider reasonable based on existing knowledge. 
In practice, this involves defining prior distributions for the parameters of the model (e.g., the weights of the number of rooms and surface area) and specifying how we believe the data is generated through a likelihood function (which gives us the probability of observing prices given our model parameters). We then incorporate actual sales data (our \u201cevidence\u201d) into the model. By combining these through Bayes\u2019 theorem, we derive the \u201cposterior distribution,\u201d which provides an updated view of the parameters and predictions, reflecting the uncertainty in our estimates given the data. This posterior distribution is what Homer and Lisa would truly find valuable.<\/p>\n<p>Given the right-skewed nature of the price data, a log-normal distribution appears to be a reasonable assumption for the likelihood. This choice should be validated with posterior predictive checks to ensure it adequately captures the data\u2019s characteristics. For the parameters, Half-Gaussian distributions constrained to be positive can reflect our assumption that prices increase with the number of rooms and surface area. The width of these priors reflects the range of possible effects, capturing our uncertainty in how much prices increase with additional rooms or surface\u00a0area.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AOqdD3TsHmEwr92PkvTsSMA.jpeg?ssl=1\"><figcaption><strong>Figure 2: Predicted Price Distributions for 2-Room (40m\u00b2) and 4-Room (90m\u00b2) Properties: <\/strong>The left plot shows the predicted price distribution for a 2-room, 40m\u00b2 property, while the right plot illustrates the predicted price distribution for a 4-room, 90m\u00b2 property. Image by the\u00a0author.<\/figcaption><\/figure>\n<p>The Bayesian approach provides a stark contrast to our earlier methods. 
While the OLS and <em>pseudo<\/em>-GLM (so called because the log-normal distribution is not a member of the exponential family) gave us single predictions with some uncertainty bounds, the Bayesian model reveals complete probability distributions for both properties. Figure 2 illustrates these predicted price distributions, showing not just point estimates but the full range of likely prices for each property type. The overlapping regions between the two distributions reveal that housing prices are not strictly determined by size and room count\u200a\u2014\u200aunmodeled factors like location quality, building condition, or market timing can sometimes make smaller properties more expensive than larger\u00a0ones.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AtoQuYne2q7uwUXF9uWMCgg.jpeg?ssl=1\"><figcaption><strong>Figure 3: Distribution of Predicted Price Differences Between 2-Room (40m\u00b2) and 4-Room (90m\u00b2) Properties:<\/strong> This plot illustrates the distribution of predicted price differences, obtained via a Monte Carlo simulation, capturing the uncertainty in the model parameters. The mean price difference is approximately 405,697\u20ac, while the median is 337,281\u20ac, reflecting a slight right skew in the distribution. Key percentiles indicate a wide range of variability: the 10th percentile is -53,318\u20ac, the 25th percentile is 126,602\u20ac, the 75th percentile is 611,492\u20ac, and the 90th percentile is 956,934\u20ac. The standard deviation of 448,854\u20ac highlights significant uncertainty in these predictions. Image by the\u00a0author.<\/figcaption><\/figure>\n<p>To understand what this means for Homer and Lisa\u2019s situation, we need to estimate the distribution of price differences between the two properties. 
Using Monte Carlo simulation, we repeatedly draw samples from both predicted distributions and calculate their differences, building up the distribution shown in Figure 3. The results are sobering: while the mean difference suggests they would need to find an additional 405,697\u20ac, there is substantial uncertainty around this figure. In fact, approximately 13.4% of the simulated scenarios result in a negative price difference, meaning there is a non-negligible chance they could actually make money on the transaction. However, they should also be prepared for the possibility of needing significantly more money: there is a 25% chance they will need more than 611,492\u20ac extra, and a 10% chance they will need more than 956,934\u20ac, to make the\u00a0upgrade.<\/p>\n<p>This more complete picture of uncertainty gives Homer and Lisa a much better foundation for their decision-making than the seemingly precise single numbers provided by our earlier analyses.<\/p>\n<h4>Sometimes Less is More: The One With The Raw\u00a0Data<\/h4>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AVSoeCZnwCIDG7c62vmt2tg.jpeg?ssl=1\"><figcaption><strong>Figure 4: Distribution of Simulated Price Differences Between 2-Room (40m\u00b2) and 4-Room (90m\u00b2) Properties:<\/strong> This distribution is obtained through Monte Carlo simulation by randomly pairing actual transactions of 2-room (35\u201345m\u00b2) and 4-room (85\u201395m\u00b2) properties. The mean price difference is 484,672\u20ac (median: 480,000\u20ac), with a substantial spread shown by the 90% interval (5th to 95th percentile) ranging from -52,810\u20ac to 1,014,325\u20ac. The shaded region below zero, representing about 6.6% of scenarios, indicates cases where a 4-room property might be found at a lower price than a 2-room one. 
The distribution\u2019s right skew suggests that while most price differences cluster around the median, there is a notable chance of encountering much larger differences, with 5% of cases exceeding 1,014,325\u20ac. Image by the\u00a0author.<\/figcaption><\/figure>\n<p>Rather than relying on sophisticated Bayesian modeling, we can gain clear insights from directly analyzing similar transactions. Looking at properties around Paris, we found 36,265 2-room flats (35\u201345m\u00b2) and 4,145 4-room properties (85\u201395m\u00b2), providing a rich dataset of actual market behavior.<\/p>\n<p>The data shows substantial price variation. Two-room properties have a mean price of 329,080\u20ac and a median price of 323,000\u20ac, with 90% of prices falling between 150,000\u20ac and 523,650\u20ac. Four-room properties show even wider variation, with a mean price of 812,015\u20ac, a median price of 802,090\u20ac and a 90% range from 315,200\u20ac to 1,309,227\u20ac.<\/p>\n<p>Using Monte Carlo simulation to randomly pair properties, we can estimate what Homer and Lisa might face. The mean price difference is 484,672\u20ac and the median price difference is 480,000\u20ac, with the middle 50% of scenarios requiring between 287,488\u20ac and 673,000\u20ac. Moreover, in 6.6% of cases, they might even find a 4-room property cheaper than their 2-room sale and make\u00a0money.<\/p>\n<p>This straightforward approach uses actual transactions rather than model predictions, making no assumptions about price relationships while capturing real market variability. For Homer and Lisa\u2019s planning, the message is clear: while they should prepare for needing around 480,000\u20ac, they should be ready for scenarios requiring significantly more or less. 
Understanding this range of possibilities is crucial for their financial planning.<\/p>\n<p>This simple technique works particularly well here because we have a dense dataset with over 40,000 relevant transactions across our target property categories. However, in many situations relying on predictive modeling, we might face sparse data. In such cases, we would need to interpolate between different data points or extrapolate beyond our available data. This is where Bayesian models are particularly powerful\u2026<\/p>\n<h3>Final Remarks<\/h3>\n<p>The journey through these analytical approaches\u200a\u2014\u200aOLS, log-normal modeling, Bayesian analysis, and Monte Carlo simulation\u200a\u2014\u200aoffers more than a range of price predictions. It highlights how we can handle uncertainty in predictive modeling with increasing sophistication. From the deceptively precise OLS estimate (439,795\u20ac) to the nuanced log-normal model (414,040\u20ac), and finally, to distributional insights provided by Bayesian and Monte Carlo methods (with means of 405,697\u20ac and 484,672\u20ac, respectively), each method provides a unique perspective on the same\u00a0problem.<\/p>\n<p>This progression demonstrates when distributional thinking becomes beneficial. For high-stakes, one-off decisions like Homer and Lisa\u2019s, understanding the full range of possibilities provides a clear advantage. In contrast, repetitive decisions with low individual stakes, like online ad placements, can often rely on simple point estimates. However, in domains where tail risks carry significant consequences\u200a\u2014\u200asuch as portfolio management or major financial planning\u200a\u2014\u200amodeling the full distribution is not just beneficial but fundamentally wise.<\/p>\n<p>It is important to acknowledge the real-world complexities simplified in this case study. Factors like interest rates, temporal dynamics, transaction costs, and other variables significantly influence real estate pricing. 
Our objective was not to develop a comprehensive housing price predictor but to illustrate, step-by-step, the progression from a naive single-point estimate to a full distribution.<\/p>\n<p>It is worth noting that, given our primary aim of illustrating this progression\u200a\u2014\u200afrom point estimates to distributional thinking\u200a\u2014\u200awe deliberately kept our models simple. The OLS and <em>pseudo<\/em>-GLM implementations were used without interaction terms\u200a\u2014\u200aand thus without regularization or hyperparameter tuning\u200a\u2014\u200aand minimal preprocessing was applied. While the high correlation between the number of rooms and surface area is not particularly problematic for predictive modeling in general, it can affect the sampling efficiency of the Markov chain Monte Carlo (MCMC) methods used in our Bayesian models by creating ridges in the posterior distribution that are harder to explore efficiently (indeed, we observed a strong ridge structure with correlation of -0.74 between these parameters, though effective sample sizes remained reasonable at about 50% of total samples, suggesting our inference should be sufficiently stable for our illustrative purposes). For the Bayesian approaches specifically, there is substantial room for improvement through defining more informative priors or the inclusion of additional covariates. While such optimizations might yield somewhat different numerical results, they would likely not fundamentally alter the key insights about the importance of considering full distributions rather than point estimates.<\/p>\n<p>Finally, we must accept that even our understanding of uncertainty is uncertain. The confidence we place in distributional predictions depends on model assumptions and data quality. 
This \u201cuncertainty about uncertainty\u201d challenges us not only to refine our models but also to communicate their limitations transparently.<\/p>\n<p>Embracing distributional thinking is not merely a technical upgrade\u200a\u2014\u200ait is a mindset shift. Single-point predictions may feel actionable, but they often provide a false sense of precision, ignoring the inherent variability of outcomes. By considering the full spectrum of possibilities, we equip ourselves to make better-informed decisions and develop strategies that are better prepared for the randomness of the real\u00a0world.<\/p>\n<h3>Sources<\/h3>\n<h4>References<\/h4>\n<p>&#8211; <strong>Duan, N.<\/strong> (1983). <em>Smearing estimate: A nonparametric retransformation method<\/em>. Journal of the American Statistical Association, 78(383), 605\u2013610. Available from <a href=\"https:\/\/www.jstor.org\/stable\/2288126\">https:\/\/www.jstor.org\/stable\/2288126<\/a>.<br \/>&#8211; <strong>Kahneman, D.<\/strong> (2011). <em>Thinking, Fast and Slow. <\/em>Kindle edition. ASIN B00555X8OA.<br \/>&#8211; <strong>MacKenzie, D., &amp; Spears, T.<\/strong> (2014). <em>\u2018The formula that killed Wall Street\u2019: The Gaussian copula and modelling practices in investment banking<\/em>. Social Studies of Science, 44(3), 393\u2013417. Available from <a href=\"https:\/\/www.jstor.org\/stable\/43284238\">https:\/\/www.jstor.org\/stable\/43284238<\/a>.<br \/>&#8211; <strong>Patterson, S.<\/strong> (2023). <em>Chaos Kings: How Wall Street Traders Make Billions in the New Age of Crisis. <\/em>Kindle edition. ASIN B0BSB49L11.<br \/>&#8211; <strong>Zuckerman, G.<\/strong> (2019). <em>The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution. <\/em>Kindle edition. ASIN B07NLFC63Y.<\/p>\n<h4>Notes<\/h4>\n<p>&#8211; <strong>gouv.fr<\/strong> (2024). 
<em>Demandes de valeurs fonci\u00e8res (DVF)<\/em>, Retrieved from <a href=\"https:\/\/www.data.gouv.fr\/fr\/datasets\/5c4ae55a634f4117716d5656\/\">https:\/\/www.data.gouv.fr\/fr\/datasets\/5c4ae55a634f4117716d5656\/<\/a>.<br \/>&#8211; <strong>Merckel, L.<\/strong> (2024a). <em>Data-Driven or Data-Derailed? Lessons from the Hello-World Classifier<\/em>. Retrieved from <a href=\"https:\/\/619.io\/blog\/2024\/11\/28\/data-driven-or-data-derailed\/\">https:\/\/619.io\/blog\/2024\/11\/28\/data-driven-or-data-derailed\/<\/a>.<br \/>&#8211; <strong>Merckel, L.<\/strong> (2024b). <em>The Crystal Ball Fallacy: What Perfect Predictive Models Really Mean<\/em>. Retrieved from <a href=\"https:\/\/619.io\/blog\/2024\/12\/03\/the-crystal-ball-fallacy\/\">https:\/\/619.io\/blog\/2024\/12\/03\/the-crystal-ball-fallacy\/<\/a>.<br \/>&#8211; <strong>Simons, J. H.<\/strong> (2013). <em>Mathematics, Common Sense, and Good Luck: My Life and Careers. <\/em>Video lecture. YouTube. <a href=\"https:\/\/www.youtube.com\/watch?v=SVdTF4_QrTM\">https:\/\/www.youtube.com\/watch?v=SVdTF4_QrTM<\/a>.<\/p>\n<p>Art and text by Loic Merckel. Licensed under <a href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY 4.0<\/a>. Originally published on <a href=\"https:\/\/619.io\/\">619.io<\/a>. For discussions or engagement, feel free to refer to the <a href=\"https:\/\/www.linkedin.com\/pulse\/when-averages-lie-moving-beyond-single-point-loic-merckel-jptxe\">LinkedIn version<\/a> or <a href=\"https:\/\/medium.com\/@loic.merckel\/when-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8\">Medium version<\/a>. 
Otherwise, attribute the <a href=\"https:\/\/www.619.io\/blog\/2024\/12\/17\/when-averages-lie\/\">original source<\/a> when sharing or\u00a0reusing.<\/p>\n<hr>\n<p><a href=\"https:\/\/towardsdatascience.com\/when-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8\">When Averages Lie: Moving Beyond Single-Point Predictions<\/a> was originally published in <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>\n<\/div>\n<p>Loic Merckel<br \/>\n<a href=\"https:\/\/medium.com\/m\/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2Fwhen-averages-lie-moving-beyond-single-point-predictions-23201e8c04c8\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When Averages Lie: Moving Beyond Single-Point Predictions The Case for Predicting Full Probability Distributions in Decision-Making Some people like hot coffee, some people like iced coffee, but no one likes lukewarm coffee. Yet, a simple model trained on coffee temperatures might predict that the next coffee served should be\u2026 lukewarm. 
This illustrates a fundamental problem [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,83,312,240,520,238],"tags":[830,831,334],"class_list":["post-727","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-data-science","category-decision-making","category-editors-pick","category-predictive-modeling","category-statistics","tag-coffee","tag-like","tag-when"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/727"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=727"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/727\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=727"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}