{"id":2330,"date":"2025-03-11T07:02:53","date_gmt":"2025-03-11T07:02:53","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/03\/11\/experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend\/"},"modified":"2025-03-11T07:02:53","modified_gmt":"2025-03-11T07:02:53","slug":"experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/03\/11\/experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend\/","title":{"rendered":"Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\">Running cool experiments is easily one of my favorite parts of working in data science.<\/p>\n<p class=\"wp-block-paragraph\">Most experiments don\u2019t deliver big wins, so the winners make for fun stories. 
We\u2019ve had a few of these at <a href=\"https:\/\/www.intelycare.com\/\">IntelyCare<\/a>, and I\u2019m sharing each story in a way that highlights a concept related to experimentation.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board\/\">Georandomization &amp; How we optimized premium listings on our job board<\/a><\/li>\n<li class=\"wp-block-list-item\">Knowing what to test &amp; How IntelyCare tested its referral bonus program into existence (coming soon).<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">And in this post, we\u2019ll share a story about how we avoided doing something stupid by running an experiment first, and we\u2019ll use that story to discuss the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Multiple_comparisons_problem\">multiple comparisons problem<\/a>.<\/p>\n<h2 class=\"wp-block-heading\">Background: IntelyCare hires nurses at scale\u2026 and it\u2019s covid <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f637.png?ssl=1\" alt=\"\ud83d\ude37\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><br \/>\n<\/h2>\n<p class=\"wp-block-paragraph\">IntelyCare connects nurses with work opportunities ranging from full-time work to individual shifts. When dealing with individual shifts, clinicians work for IntelyCare as employees (agency model). This means we\u2019re hiring nurses 24\/7.<\/p>\n<p class=\"wp-block-paragraph\">You may have suppressed this memory, but in 2020 and 2021 we had this global pandemic. Hiring nurses during the pandemic was nothing short of a rock fight. 
We had full business permission to try everything and anything that could help us hire nurses more quickly and efficiently.<\/p>\n<h2 class=\"wp-block-heading\">The problem: Lots of applies, but not so many new hires<\/h2>\n<p class=\"wp-block-paragraph\">Working anywhere in healthcare means submitting a big pile of paperwork \u2014 licenses, immunizations, certifications, and more in addition to the regular resumes, references, and background checks.<\/p>\n<p class=\"wp-block-paragraph\">IntelyCare is no different. And even though we make it all phone-friendly and digital, submitting all this paperwork is about as fun as filing your taxes. And that means many people who apply give up somewhere between creating an account and completing a shift.<\/p>\n<h2 class=\"wp-block-heading\">The solution: Just throw money at it! <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f4b8.png?ssl=1\" alt=\"\ud83d\udcb8\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><br \/>\n<\/h2>\n<p class=\"wp-block-paragraph\">We tried lots of things (including different referral incentives). One easy-to-try proposal was to just pay clinicians an extra $100 when they complete their first shift.<\/p>\n<p class=\"wp-block-paragraph\">Why $100? Because it\u2019s a nice round number and looks good on <a href=\"https:\/\/towardsdatascience.com\/tag\/marketing\/\" title=\"Marketing\">Marketing<\/a> materials. You might be surprised how many business decisions are made this way (unless you\u2019re in marketing, in which case it\u2019s perfectly normal).<\/p>\n<p class=\"wp-block-paragraph\">The idea was so easy we almost went live without a test. There was a lot of pressure to move quickly and we wanted to be fast. 
But science prevailed and instead of offering $100 to everybody, we randomly assigned each applicant a bonus of $0, $25, $50, $75, or $100.<\/p>\n<p class=\"wp-block-paragraph\">Clinicians were informed of the bonus via email throughout the application process. (Unless you had a $0 bonus \u2014 no email for you.)<\/p>\n<p class=\"wp-block-paragraph\">We ran this test for several months to give candidates sufficient time to complete their applications. By the time we circled back to make a decision, we had several thousand applicants at each bonus level.<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Spillover_(experiment)\">Spillovers<\/a>? It\u2019s always a possibility but it seems unlikely. Demand for nursing talent was insanely high at the time. I have a hard time imagining clinicians with high bonuses stealing all the shifts from those with low or no bonuses (thereby exaggerating the impact of the high bonus). There were plenty of shifts to go around.<\/p>\n<h2 class=\"wp-block-heading\">Technical aside: Multiple comparisons<\/h2>\n<p class=\"wp-block-paragraph\">If you ever run a test like this, chances are some higher-up will ask you to \u201cslice and dice\u201d or \u201ccut\u201d or perhaps \u201cdig into\u201d the data 100 different ways. This is fun <a href=\"https:\/\/xkcd.com\/882\/\">but also dangerous<\/a>. Wait, dangerous?! Let\u2019s discuss.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Datasets are finite and noisy, which means anytime you test a hypothesis using your dataset there\u2019s a chance your answers are incorrect. Sorry, I didn\u2019t make the rules.\n<\/li>\n<li class=\"wp-block-list-item\">To understand the risk of an incorrect answer, we look at the <em>variance<\/em> of a dataset. Knowing the variance helps us know if a statistic is \u201cclose\u201d or \u201cfar away\u201d from another possible answer. (e.g. 
\u201cDoes a marketing campaign have a non-zero impact on sales?\u201d)\n<\/li>\n<li class=\"wp-block-list-item\">Suppose, given the amount of noise in my data, there\u2019s a 5% chance I draw a false conclusion for a given hypothesis. I\u2019m curious to know if a marketing campaign increased sales, and my boss wants to know how the impact differs for men, women, old people, young people, people in Idaho, people in Florida, and so on. See the danger now? If I ask 20 independent questions, each with a 5% chance of a false conclusion, there\u2019s roughly a 64% chance (1 \u2212 0.95<sup>20<\/sup>) that at least one of the answers is wrong. And if that means your company starts marketing like crazy to teenagers in Idaho, that could be an expensive mistake!\n<\/li>\n<li class=\"wp-block-list-item\">While your slicing and dicing isn\u2019t a machine-learning model, you can <em>overfit<\/em> your analysis by asking too many questions. Just as machine-learning engineers have ways to avoid overfitting models, analysts need ways to avoid drawing overfit conclusions from a finite dataset.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Call before you dig: 1-BON-FER-RONI<\/h3>\n<p class=\"wp-block-paragraph\">So what is an analyst to do? 
There are many heuristics, all of which make it harder to reject a null hypothesis.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Adjust p-values required for \u201cstatistical significance\u201d (<a href=\"https:\/\/medium.com\/@Dr_nabil_ebraheim\/understanding-the-bonferroni-correction-guarding-against-false-positives-in-multiple-comparisons-8d4cdd5061db\">Bonferroni correction<\/a>).<\/li>\n<li class=\"wp-block-list-item\">Use a ranking of p-values to determine when to stop considering a result as significant (<a href=\"https:\/\/medium.com\/@jorgepit-14189\/the-benjamini-hochberg-procedure-fdr-and-p-value-adjusted-explained-5577f722a2ac\">Benjamini-Hochberg<\/a>).<\/li>\n<li class=\"wp-block-list-item\">Instead of taking the experiment results at face value, use them to update some Bayesian prior representing your current-best view of the world (<a href=\"https:\/\/www.stat.colostate.edu\/~jah\/papers\/statsci.pdf\">Bayesian Model Averaging<\/a>). You can use this to combine results from several tests, when appropriate.<\/li>\n<li class=\"wp-block-list-item\">\n<a href=\"https:\/\/medium.com\/towards-data-science\/bootstrapping-statistics-what-it-is-and-why-its-used-e2fa29577307\">Bootstrapping<\/a> \u2014 sample from the experimental data with replacement, compute your test statistic, repeat a zillion times, and then consider a full distribution of test statistics. Bootstrapping does not immediately solve your multiple comparisons problem, but knowing the variance of your test statistics can help you be a more critical consumer of p-values.<\/li>\n<li class=\"wp-block-list-item\">\n<a href=\"https:\/\/towardsdatascience.com\/understanding-group-sequential-testing-befb35cec07a\/\">Dynamic stopping rules<\/a>. List out your hypotheses. 
As results come in, stop testing each hypothesis as soon as the evidence is clear <em>but continue to test other hypotheses with additional data.<\/em> Eventually, you run out of data or you run out of hypotheses. Why do we not revisit our prior hypotheses with the additional data? Because we\u2019d be right back in multiple comparisons hell. The sequential nature of the exercise ties our hands to the mast so we don\u2019t go swimming after sirens.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">If you\u2019re interested in a more detailed summary, I\u2019d recommend the following:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">StatSig: <a href=\"https:\/\/www.statsig.com\/glossary\/correction-for-multiple-comparisons\">Correction for Multiple Comparisons<\/a>\n<\/li>\n<li class=\"wp-block-list-item\">John McDonald\u2019s <a href=\"https:\/\/stats.libretexts.org\/Bookshelves\/Applied_Statistics\/Biological_Statistics_(McDonald)\/06%3A_Multiple_Tests\/6.01%3A_Multiple_Comparisons\">spreadsheet examples<\/a>\n<\/li>\n<li class=\"wp-block-list-item\">Spotify Engineering: <a href=\"https:\/\/engineering.atspotify.com\/2023\/03\/choosing-sequential-testing-framework-comparisons-and-discussions\/\">Choosing a Sequential Testing Framework<\/a>\n<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Back to the bonuses<\/h2>\n<p class=\"wp-block-paragraph\">We\u2019re a curious bunch and so considered looking at several cuts of our experiment data: location, age, qualification, and more. Wouldn\u2019t it be amazing if bonuses were ineffective for nurses\u2026 except for nurses younger than 30 years old living in Rhode Island with active Netflix accounts? 
Many marketing teams are ready to jump at exactly these kinds of \u201cpatterns\u201d and I\u2019m kindly going to ask you to show me your Bonferroni receipts.<\/p>\n<p class=\"wp-block-paragraph\">After taking multiple comparisons into account, we found <em>one<\/em> dimension that was truly meaningful \u2014 whether the applicant was a nurse or a nursing assistant (CNA).<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"eddfda\" data-has-transparency=\"false\" style=\"--dominant-color: #eddfda;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"617\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1-1024x617.png?resize=1024%2C617&#038;ssl=1\" alt=\"\" class=\"wp-image-599379 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1-1024x617.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1-300x181.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1-768x463.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1-1536x925.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-illustrated-1.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\"><em>Note how the bonuses differ from the \u201cNo Bonus\u201d group. (image by author)<\/em><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Without a bonus, nurses and nursing assistants went on to complete a shift at about the same rate. Nursing assistants were more likely to start working with a bonus of any amount. Nurses, on the other hand, were <em>less likely<\/em> to start working! 
(And yes, these are all statistically significantly different from no bonus, for all you skeptics out there.)<\/p>\n<p class=\"wp-block-paragraph\">For any readers from outside healthcare, it\u2019s important to know that nurses can easily earn between 2X and 4X the hourly rate of a nursing assistant. These populations differ in so many ways, which is why we put this dimension at the top of our sequential-testing list.<\/p>\n<p class=\"wp-block-paragraph\">Years later, I still scratch my head at this chart and wonder why completion rates <em>decreased<\/em> among nurses when we offered <em>more<\/em> money. Maybe no gift is better than a cheap gift? <a href=\"https:\/\/jrreport.wordandbrown.com\/2021\/07\/27\/25000-signing-bonuses-to-hospital-workers-whatever-it-takes\/\">Hospitals at the time were offering signing bonuses as high as $25,000 for full-time work<\/a>.<\/p>\n<h2 class=\"wp-block-heading\">What\u2019s the optimal bonus amount?<\/h2>\n<p class=\"wp-block-paragraph\">After running this test, we did away with bonuses for nurses. Maybe some bonus greater than $100 would have improved our funnel metrics? That\u2019s another test for another day.<\/p>\n<p class=\"wp-block-paragraph\">For CNAs, note the large difference between the no-bonus group and the $25 bonus group (nearly 5 full percentage points). From there, each additional $25 has a much smaller effect, and somewhere between $50 and $100 the marginal benefit from bigger bonuses reaches zero. We ended up going with $25 to give us room to bump things up at specific times and places as needed.<\/p>\n<p class=\"wp-block-paragraph\">Remember the initial proposal was to give $100 to <em>everyone. 
<\/em>Had we done that, <strong>we would have spent $1M extra in bonuses in one year<\/strong> and would likely have recruited the same number of people.<\/p>\n<h2 class=\"wp-block-heading\">Key takeaways for those who made it this far<\/h2>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">You don\u2019t need fancy machinery to run an impactful test. For this test, all we needed was (1) random assignment and (2) a way to send 4 variations of an email. We\u2019re lucky to have a nice data warehouse and a CRM, but we honestly could have run this off spreadsheets.<\/li>\n<li class=\"wp-block-list-item\">We have a strong preference for nice, round numbers in our promotions. But we found a $25 bonus was basically as effective as a $100 bonus. We\u2019ve run other tests that show bonuses are more about timing and presentation vs the sheer dollar amount.<\/li>\n<li class=\"wp-block-list-item\">It\u2019s tempting to cut a dataset 900 different ways and then chase the best cuts with promotions or other interventions. Exploring is fine, but correct for the multiple comparisons problem before you act on any single cut.<\/li>\n<\/ul>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend\/\">Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p>Ben Tengelsen<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend Running cool experiments is easily one of my favorite parts of working in data science. 
Most experiments don\u2019t deliver big wins, so the winners make for fun stories. We\u2019ve had a few of these at IntelyCare, and I\u2019m sharing each story in a way [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,211,83,1977,176,1710,1979],"tags":[7,1980,1981],"class_list":["post-2330","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-data-analysis","category-data-science","category-experiment-design","category-marketing","category-statisitcs","category-testing-techniques","tag-how","tag-intelycare","tag-nurses"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2330"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=2330"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2330\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=2330"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=2330"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=2330"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}