{"id":3479,"date":"2025-05-01T07:02:25","date_gmt":"2025-05-01T07:02:25","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/05\/01\/beyond-glorified-curve-fitting-exploring-the-probabilistic-foundations-of-machine-learning\/"},"modified":"2025-05-01T07:02:25","modified_gmt":"2025-05-01T07:02:25","slug":"beyond-glorified-curve-fitting-exploring-the-probabilistic-foundations-of-machine-learning","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/05\/01\/beyond-glorified-curve-fitting-exploring-the-probabilistic-foundations-of-machine-learning\/","title":{"rendered":"Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning"},"content":{"rendered":"<p>    Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>\n<p class=\"wp-block-paragraph\"><mdspan datatext=\"el1746060446031\" class=\"mdspan-comment\">You see<\/mdspan> a math formula you don\u2019t immediately understand.<\/p>\n<p class=\"wp-block-paragraph\">Your instinct? Stop reading.<\/p>\n<p class=\"wp-block-paragraph\">Don\u2019t.<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/Probabilistic-View-on-Machine-Learning.png?ssl=1\" alt=\"\" class=\"wp-image-602600\"><\/figure>\n<p class=\"wp-block-paragraph\">That\u2019s exactly what I told myself when I started reading <em>Probabilistic Machine Learning \u2013 An Introduction<\/em> by Kevin P. Murphy.<\/p>\n<p class=\"wp-block-paragraph\">And it was absolutely worth it.<\/p>\n<p class=\"wp-block-paragraph\">It changed how I think about machine learning.<\/p>\n<p class=\"wp-block-paragraph\">Sure, some formulas might look complicated at first glance.<\/p>\n<p class=\"wp-block-paragraph\">But let\u2019s look at the formula to see that what it describes is simple.<\/p>\n<p class=\"wp-block-paragraph\">When a machine learning model makes a prediction (for example, a classification), what is it really doing?<\/p>\n<p class=\"wp-block-paragraph\">It\u2019s distributing probabilities across all possible outcomes \/ classes.<\/p>\n<p class=\"wp-block-paragraph\">And those probabilities must always add up to 100 % \u2014 or 1.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s take a look at an example: Imagine we show the model an image of an animal and ask: \u201cWhat animal is this?\u201d<\/p>\n<p class=\"wp-block-paragraph\">The model might respond:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Cat: 85%<\/li>\n<li class=\"wp-block-list-item\">Dog: 10%<\/li>\n<li class=\"wp-block-list-item\">Fox: 5%<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Add them up?<br \/>Exactly 100%.<\/p>\n<p class=\"wp-block-paragraph\">This means the model believes it\u2019s most likely a cat \u2014 but it\u2019s also leaving a small chance for dog or fox.<\/p>\n<p class=\"wp-block-paragraph\">This simple formula reminds us that machine learning models can not only give us an answer (=It\u2019s a cat!), but also reveal how confident they are in their prediction.<\/p>\n<p class=\"wp-block-paragraph\">And we can use this uncertainty to make better decisions.<\/p>\n<figure class=\"wp-block-image size-full is-resized\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/Introducing-the-Probabilistic-View-of-Machine-Learning.png?ssl=1\" alt=\"\" class=\"wp-image-602599\" style=\"width:680px;height:auto\"><figcaption class=\"wp-element-caption\">Own visualization \u2014 Illustrations from <a href=\"http:\/\/undraw.com\/\" data-type=\"link\" data-id=\"unDraw.com\">unDraw.com<\/a>.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\"><em><strong>Table of Contents<\/strong><br \/><a data-type=\"internal\" data-id=\"#01\" href=\"https:\/\/towardsdatascience.com\/#01\">1\u00a0What does machine learning from a probabilistic view mean?<\/a><br \/><a data-type=\"internal\" data-id=\"#02\" href=\"https:\/\/towardsdatascience.com\/#02\">2 So, what is supervised learning?<\/a><br \/><a href=\"https:\/\/towardsdatascience.com\/#03\">3\u00a0So, what is unsupervised learning?<\/a><br \/><a href=\"https:\/\/towardsdatascience.com\/#04\">4\u00a0So, and what is reinforcement learning?<\/a><br \/><a href=\"https:\/\/towardsdatascience.com\/#05\">5 From a mathematical perspective: What are we actually learning?<\/a><br \/><a href=\"https:\/\/towardsdatascience.com\/#06\">Final Thought \u2014 What\u2019s the point of understanding the probabilistic view anyway?<\/a><br \/><a href=\"https:\/\/towardsdatascience.com\/#07\">Where Can You Continue Learning?<\/a><\/em><\/p>\n<h2 class=\"wp-block-heading\" id=\"01\">What does machine learning from a probabilistic view mean?<\/h2>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Tom_M._Mitchell\">Tom Mitchell<\/a>, an American computer scientist, defines machine learning as follows:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt;<em> A computer program is said to learn from experience E with respect to some class of tasks T, and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\"><strong>Let\u2019s break this down:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>T (Task)<\/strong>: The task to be solved, such as classifying images or predicting the amount of electricity needed for purchasing.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>E (Experience)<\/strong>: The experience the model learns from. For example, training data such as images or past electricity purchases versus actual consumption.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>P (Performance Measure)<\/strong>: The metric used to evaluate performance, such as accuracy, error rate or mean squared error (MSE).<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Where does the probabilistic view come in?<\/h3>\n<p class=\"wp-block-paragraph\">In classical machine learning, a value is often simply predicted:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>\u201cThe house price is 317k CHF.\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">The probabilistic view, however, focuses on learning probability distributions.<\/p>\n<p class=\"wp-block-paragraph\">Instead of generating fixed predictions, we are interested in how likely which different outcomes (in this example prices) are.<\/p>\n<p class=\"wp-block-paragraph\">Everything that is uncertain \u2014 outputs, parameters, predictions \u2014 is treated as a random variable.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">In the case of a house price, there might still be negotiation opportunities or risks that are mitigated through mechanisms like insurance. <\/p>\n<p class=\"wp-block-paragraph\">But let\u2019s now look at an example where it is really crucial for good decisions that the uncertainty is explicitly modelled:<\/p>\n<p class=\"wp-block-paragraph\">Imagine an energy supplier who needs to decide today how much electricity to buy.<\/p>\n<p class=\"wp-block-paragraph\">The uncertainty lies in the fact that energy demand depends on many factors: temperature, weather, the economic situation, industrial production, self-production through photovoltaic systems and so on. All of which are uncertain variables.<\/p>\n<h3 class=\"wp-block-heading\">And where does probability help us now?<\/h3>\n<p class=\"wp-block-paragraph\">If we rely only on a single best estimate, we risk either:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">that we have too much energy (leading to costly overproduction).<\/li>\n<li class=\"wp-block-list-item\">that we have too little energy (causing a supply gap).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">With a probability calculation, on the other hand, we can plan that there is a 95% probability that demand will remain below 850 MWh, for example. And this, in turn, allows us to calculate the safety buffer correctly \u2014 not based on a single point prediction, but on the entire range of possible outcomes.<\/p>\n<p class=\"wp-block-paragraph\">If we have to make an optimal decision under uncertainty, this is only possible if we explicitly model the uncertainty.<\/p>\n<h3 class=\"wp-block-heading\">Why is this important?<\/h3>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>Making better decisions under uncertainty:<\/strong><br \/>If our model understands uncertainty, we can better weigh risks. For example, in credit scoring, a customer labelled as an \u2018unsafe customer\u2019 could trigger additional verification steps.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Increasing trust and interpretability:<\/strong><br \/>For us humans, probabilities are more tangible than rigid point predictions. Probabilistic outputs help stakeholders understand not only what a model predicts, but also how confident it is in its predictions.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">To understand why the probabilistic view is so powerful, we need to look at how machines actually learn (Supervised Learning, Unsupervised Learning or <a href=\"https:\/\/towardsdatascience.com\/tag\/reinforcement-learning\/\" title=\"Reinforcement Learning\">Reinforcement Learning<\/a>). So, this is next.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h2 class=\"wp-block-heading\">Many machine learning models are deterministic \u2014 but the world is uncertain:<\/h2>\n<h3 class=\"wp-block-heading\" id=\"02\">So, what is supervised learning?<\/h3>\n<p class=\"wp-block-paragraph\">In simple terms, <a href=\"https:\/\/towardsdatascience.com\/tag\/supervised-learning\/\" title=\"Supervised Learning\">Supervised Learning<\/a> means that we have examples \u2014 and for each example, we know what it means.<\/p>\n<p class=\"wp-block-paragraph\">For instance:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>&gt; If you see this picture (input x), then the flower is called Setosa (output y).<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">The aim is to find a rule that makes good predictions for new, unseen inputs. Typical examples of supervised learning tasks are classification or regression.<\/p>\n<h3 class=\"wp-block-heading\">What does the probabilistic view add?<\/h3>\n<p class=\"wp-block-paragraph\">The probabilistic view reminds us that there is no absolute certainty in the real world.<\/p>\n<p class=\"wp-block-paragraph\"><strong>In the real world, nothing is perfectly predictable.<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Sometimes information is missing \u2014 this is known as epistemic uncertainty.<\/li>\n<li class=\"wp-block-list-item\">Sometimes the world is inherently random \u2014 this is known as aleatoric uncertainty.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Therefore, instead of working with a single \u2018fixed answer\u2019, probabilistic models work with probabilities:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>\u201cThe model is 95% certain to be a Setosa.\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">This way, the model does not just guess, but also expresses how confident it is.<\/p>\n<h3 class=\"wp-block-heading\">And what about the No Free Lunch Theorem?<\/h3>\n<p class=\"wp-block-paragraph\">In machine learning, there is no single \u201cbest method\u201d that works for every problem.<\/p>\n<p class=\"wp-block-paragraph\">The No Free Lunch Theorem tells us:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>If an algorithm performs particularly well on a certain type of task, it will perform worse on other types of tasks.<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Why is that?<\/p>\n<p class=\"wp-block-paragraph\">Because every algorithm makes assumptions about the world. These assumptions help in some situations \u2014 and hurt in others.<\/p>\n<p class=\"wp-block-paragraph\">Or as <a href=\"https:\/\/en.wikipedia.org\/wiki\/All_models_are_wrong\">George Box<\/a> famously said:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><em>&gt; All models are wrong, but some models are useful.<\/em><\/p>\n<\/blockquote>\n<h3 class=\"wp-block-heading\">Supervised learning as \u201cglorified curve fitting\u201d<\/h3>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/arxiv.org\/abs\/1801.04016\">J. Pearl<\/a> describes supervised learning as \u2018glorified curve fitting\u2019.<\/p>\n<p class=\"wp-block-paragraph\">What he meant is that supervised learning is, at its core, about connecting known points (x, y) as smoothly as possible \u2014 like drawing a clever curve through data.<\/p>\n<p class=\"wp-block-paragraph\">In contrast, <a href=\"https:\/\/towardsdatascience.com\/tag\/unsupervised-learning\/\" title=\"Unsupervised Learning\">Unsupervised Learning<\/a> is about making sense of the data without any labels \u2014 trying to understand the underlying structure without a predetermined target.<\/p>\n<h3 class=\"wp-block-heading\" id=\"03\">So, what is unsupervised learning?<\/h3>\n<p class=\"wp-block-paragraph\">Unsupervised learning means that the model receives data \u2014 but no explanations or labels.<\/p>\n<p class=\"wp-block-paragraph\">For example:<\/p>\n<p class=\"wp-block-paragraph\">When the model sees an image (input x), it is not told whether it is a Setosa, Versicolor or Virgnica.<\/p>\n<p class=\"wp-block-paragraph\">The model has to find out for itself whether there are groups, patterns or structures in the data. A typical example of unsupervised learning is clustering.<\/p>\n<p class=\"wp-block-paragraph\">The aim is therefore not to learn a fixed rule, but to better understand the hidden structure of the world.<\/p>\n<h3 class=\"wp-block-heading\">How does the probabilistic view help us here?<\/h3>\n<p class=\"wp-block-paragraph\">We are not trying to say:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>\u201cThis picture is definitely a Setosa.\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">but rather:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>\u201cWhat structures or patterns are probably hidden in the data?\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Probabilistic thinking allows us to capture uncertainty and diversity in possible explanations. Instead of forcing a hard classification, we model possibilities.<\/p>\n<h3 class=\"wp-block-heading\">Why do we need unsupervised learning?<\/h3>\n<p class=\"wp-block-paragraph\">Sometimes there are no labels for the data \u2014 or they would be very expensive or difficult to collect (e.g. medical diagnoses).<\/p>\n<p class=\"wp-block-paragraph\">Sometimes the categories are not clearly defined (for example, when exactly an action starts and when it is finished).<\/p>\n<p class=\"wp-block-paragraph\">Or sometimes the task of the model is to discover patterns that we do not yet recognise ourselves.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s look at an example:<\/p>\n<p class=\"wp-block-paragraph\">Imagine we have a collection of animal images \u2014 but we don\u2019t tell the model which animal is shown.<\/p>\n<p class=\"wp-block-paragraph\">The task is: The model should group similar animals together. Purely based on patterns it can detect.<\/p>\n<h2 class=\"wp-block-heading\" id=\"04\">So, and what is reinforcement learning?<\/h2>\n<p class=\"wp-block-paragraph\">Reinforcement learning means that a system learns from experience by acting and receiving feedback about whether its actions were good or bad.<\/p>\n<p class=\"wp-block-paragraph\">In other words:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">The system sees a situation (input x).<\/li>\n<li class=\"wp-block-list-item\">The system selects an action (a).<\/li>\n<li class=\"wp-block-list-item\">The system receives a reward or punishment.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">In simple words, it\u2019s actually similar to how we train a dog.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s take a look at an example:<\/p>\n<p class=\"wp-block-paragraph\">A robot is trying to learn how to walk. It tries out various movements. If the roboter falls over, it learns, that action was bad. If the robot manages a few steps, it gets a positive reward.<\/p>\n<p class=\"wp-block-paragraph\">Behind the scenes, the robot builds a strategy or a rule called a policy \u03c0(x):<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">&gt; <em>\u201cIn situation x, choose action a.\u201d<\/em><\/p>\n<\/blockquote>\n<p class=\"wp-block-paragraph\">Initially, these rules are purely random or very bad. The robot is in the exploration phase to find out what works and what does not. Through each experience (e.g. falling or walking), the robot receives feedback (rewards) such as +1 point for standing upright, -10 points for falling over.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Over time, the robot adjusts its policy to prefer actions that lead to higher cumulative rewards. It changes its rule \u03c0(x) to make more out of good experiences and avoid bad experiences.<\/p>\n<p class=\"wp-block-paragraph\">What is the robot\u2019s goal?<\/p>\n<p class=\"wp-block-paragraph\">The robot wants to find actions that bring the highest reward over time (e.g. staying upright, moving forwards).<\/p>\n<p class=\"wp-block-paragraph\">Mathematically, the robot tries to maximise its expected future reward value.<\/p>\n<h3 class=\"wp-block-heading\">How does the probabilistic view help us?<\/h3>\n<p class=\"wp-block-paragraph\">The system (in this example the robot) often does not know exactly which of its many actions has led to the reward. This means that it has to learn under uncertainty which strategies (policies) are good.<\/p>\n<p class=\"wp-block-paragraph\">In reinforcement learning, we are therefore trying to learn a policy:<\/p>\n<p class=\"wp-block-paragraph\"><em>\u03c0(x)<\/em><\/p>\n<p class=\"wp-block-paragraph\">This policy defines, which action should the system perform in which situation to maximize rewards over time.<\/p>\n<h3 class=\"wp-block-heading\">Why is reinforcement learning so fascinating?<\/h3>\n<p class=\"wp-block-paragraph\">Reinforcement learning mirrors the way humans and animals learn.<\/p>\n<p class=\"wp-block-paragraph\">It is perfect for tasks where there are no clear examples, but where improvement comes through experience.<\/p>\n<p class=\"wp-block-paragraph\">The film AlphaGo and the breakthrough are based on reinforcement learning.<\/p>\n<h2 class=\"wp-block-heading\" id=\"05\">From a mathematical perspective: What are we actually learning?<\/h2>\n<p class=\"wp-block-paragraph\">When we talk about a model in machine learning, we mean more than just a function in the probabilistic view.<\/p>\n<p class=\"wp-block-paragraph\">A model is a distributional assumption about the world.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Let\u2019s take a look at the classical view:<\/strong><\/p>\n<p class=\"wp-block-paragraph\">A model is a function f(x)=y that translates an input into an output.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Let\u2019s now take a look at the probabilistic view:<\/strong><\/p>\n<p class=\"wp-block-paragraph\">A model explicitly describes uncertainty \u2014 for example in f(x)=p(y\u2223x).<\/p>\n<p class=\"wp-block-paragraph\">It is not about providing one \u201cbest answer\u201d, but about modelling how likely different answers are.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">In supervised learning, we learn a function that describes the conditional probability p(y|x):<br \/>The probability of a label y, given an input x.<br \/>We ask: \u201cWhat is the correct answer to this input?\u201d<br \/>Formula: f(x)=p(y\u2223x)<\/li>\n<li class=\"wp-block-list-item\">In unsupervised learning, we learn a function that describes the probability distribution p(x) of the input data:<br \/>The probability of the data itself, without explicit target values.<br \/>We ask, \u2018How probable is this data itself?\u2019.<br \/>Formula: f(x)=p(x)<\/li>\n<li class=\"wp-block-list-item\">In reinforcement, we learn a policy \u03c0(x) that determines the optimal action a for a state x:<br \/>A rule that suggests an action a for every possible state x, which brings as much reward as possible in the long term.<br \/>We ask: \u2018Which action should be carried out now so that the system receives the best reward in the long term?<br \/>Formula: a=\u03c0(x)<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<p class=\"wp-block-paragraph\"><em>On my <\/em><a href=\"https:\/\/sarahleaschrch.substack.com\/\"><em>Substack<\/em><\/a><em>, I regularly write summaries about the published articles in the fields of Tech, Python, Data Science, Machine Learning and AI. If you\u2019re interested, take a look or subscribe.<\/em><\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h2 class=\"wp-block-heading\" id=\"06\">Final Thought \u2014 What\u2019s the point of understanding the probabilistic view, anyway?<\/h2>\n<p class=\"wp-block-paragraph\">In the real world, almost nothing is truly certain.<\/p>\n<p class=\"wp-block-paragraph\">Uncertainty, incomplete information and randomness characterise every decision we make.<\/p>\n<p class=\"wp-block-paragraph\">Probabilistic machine learning helps us to deal with exactly that.<\/p>\n<p class=\"wp-block-paragraph\">Instead of just trying to be \u201cmore accurate\u201d, a probabilistic approach becomes:<\/p>\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">More robust against errors and uncertainties.<br \/>For example, in a medical diagnostic system, we want a model that indicates its uncertainty (\u2018it is 60 % certain that it is cancer\u2019) instead of making a fixed diagnosis. In this way, additional tests can be carried out if there is a high degree of uncertainty.<\/li>\n<li class=\"wp-block-list-item\">More flexible and therefore more adaptable to new situations.<br \/>For example, a model that models weather data probabilistically can react more easily to new climate conditions because it learns about uncertainties.<\/li>\n<li class=\"wp-block-list-item\">More comprehensible and interpretable, in that models not only give us an answer, but also how certain they are.<br \/>For example, in a credit scoring system, we can show stakeholders that the model is 90% certain that a customer is creditworthy. The remaining 10% uncertainty is explicitly communicated \u2014 this helps with transparent decisions and risk assessments.<\/li>\n<\/ol>\n<p class=\"wp-block-paragraph\">These advantages make probabilistic models more transparent, trustworthy and interpretable systems (instead of black box algorithms).<\/p>\n<h2 class=\"wp-block-heading\" id=\"07\">Where Can You Continue Learning?<\/h2>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/probml.github.io\/pml-book\/book1.html\">Book\u00a0 \u2014 Probabilistic Machine Learning: An Introduction by Kevin P. Murphy<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/probml.github.io\/pml-book\/book2.html\">Book \u2014 Probabilistic Machine Learning: Advanced Topics by Kevin P. Murphy<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.geeksforgeeks.org\/supervised-vs-reinforcement-vs-unsupervised\/\">GeeksForGeeks Blog \u2014 Supervised vs Unsupervised vs Reinforcement Learning<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/blogs.nvidia.com\/blog\/supervised-unsupervised-learning\/\">Nvidia Blog \u2014 SuperVize Me: What\u2019s the Difference Between Supervised, Unsupervised, Semi-Supervised and Reinforcement Learning?<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.datacamp.com\/blog\/supervised-machine-learning\">DataCamp Blog \u2014 Supervised Machine Learning<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.datacamp.com\/blog\/introduction-to-unsupervised-learning\">DataCamp Blog \u2014 Introduction to Unsupervised Learning<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.datacamp.com\/cheat-sheet\/unsupervised-machine-learning-cheat-sheet\">DataCamp Blog \u2014 Unsupervised Machine Learning Cheat Sheet<\/a><\/li>\n<\/ul>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/beyond-glorified-curve-fitting-exploring-the-probabilistic-foundations-of-machine-learning\/\">Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Sarah Sch\u00fcrch<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/beyond-glorified-curve-fitting-exploring-the-probabilistic-foundations-of-machine-learning\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning You see a math formula you don\u2019t immediately understand. Your instinct? Stop reading. Don\u2019t. That\u2019s exactly what I told myself when I started reading Probabilistic Machine Learning \u2013 An Introduction by Kevin P. Murphy. And it was absolutely worth it. It changed how I [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,240,70,504,238,2524,674],"tags":[199,341,41],"class_list":["post-3479","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-editors-pick","category-machine-learning","category-reinforcement-learning","category-statistics","category-supervised-learning","category-unsupervised-learning","tag-learning","tag-machine","tag-what"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3479"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=3479"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3479\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=3479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=3479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=3479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}