{"id":2329,"date":"2025-03-11T07:02:52","date_gmt":"2025-03-11T07:02:52","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/03\/11\/experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board\/"},"modified":"2025-03-11T07:02:52","modified_gmt":"2025-03-11T07:02:52","slug":"experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/03\/11\/experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board\/","title":{"rendered":"Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board"},"content":{"rendered":"<p>Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board<\/p>\n<div>\n<p class=\"wp-block-paragraph\">Running experiments is a task that often falls to data scientists. If that\u2019s you, congrats! It can be a rewarding and high-impact area of work, but it also requires tools found outside the typical ML-heavy data science curriculum.<\/p>\n<p class=\"wp-block-paragraph\">Even with the best tools, only a small share of experiments deliver meaningful business value. I\u2019ve been lucky to design and execute many experiments, and a few of them were winners. 
From these, I\u2019m sharing some stories to illustrate key concepts related to experiments.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-random-assignment-saved-us-1m-in-marketing-spend\/\">Multiple Comparisons &amp; How Random Assignment Saved us $1M in Marketing Spend<\/a><\/li>\n<li class=\"wp-block-list-item\">Choosing what to test &amp; How IntelyCare tested its referral bonus program into existence (coming soon)<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>Background: <\/strong>I work at a company called IntelyCare. We help connect nurses with various work opportunities (full-time, part-time, contracts, per-diem\u2026 the whole menu).<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">One of our core offerings is a nursing-only <a href=\"https:\/\/www.intelycare.com\/jobs\/\">job board<\/a>. If you take a look in the year 2025, you\u2019ll notice two ways of sorting jobs: by date and by relevance.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>Why it matters: <\/strong>The sort-by-relevance feature is currently our best lever for guaranteeing a good experience for paying customers. It also gives us an opportunity to improve the overall efficiency of our job board by steering eyeballs away from low-quality jobs.<\/p>\n<p class=\"wp-block-paragraph\">Unfortunately, we can\u2019t put <em>every<\/em> job at the top of a search result. We face a tradeoff between the <em>quantity<\/em> of top-page listings and the <em>quality<\/em> of the experience each listing gets, measured in increased applies.<\/p>\n<p class=\"wp-block-paragraph\"><strong>How it works: <\/strong>\u201cRelevance\u201d doesn\u2019t mean what it normally means. Sorry!<\/p>\n<p class=\"wp-block-paragraph\">We give each job a score between 0 and 100. When filling a page with jobs, sorting by relevance means we sort the results by that score, highest first. That\u2019s it! 
For brevity, we\u2019ll say any job with a score higher than 0 is \u201cboosted.\u201d<\/p>\n<p class=\"wp-block-paragraph\">I know what you\u2019re thinking, \u201cThis isn\u2019t relevance!\u201d And you\u2019re right, at least in the normal sense of the word. The score doesn\u2019t vary across job-seekers or search terms. A better name would be \u201crelevant to Google.\u201d We\u2019re OK with that because a huge share of our job-board traffic comes from Google, as shown below.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"272627\" data-has-transparency=\"false\" style=\"--dominant-color: #272627;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"622\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-1024x622.png?resize=1024%2C622&#038;ssl=1\" alt=\"Screenshot of Google search results\" class=\"wp-image-599407 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-1024x622.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-300x182.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-768x467.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-1536x933.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\">\u201cSort by relevance\u201d here is shorthand for \u201crelevant to Google.\u201d (Image by author)<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\"><strong>In <a href=\"https:\/\/towardsdatascience.com\/tag\/math\/\" title=\"Math\">Math<\/a>: <\/strong>We have N jobs. Every day we generate a vector of N integers between 0 and 100. We feed this vector into a black box named Google. 
If we do a good job, the black box rewards us with many job applications.<\/p>\n<p class=\"wp-block-paragraph\">By putting the \u201cright\u201d jobs at the top of the page (loaded word there), we can improve upon a chronological sort. Before we can identify the right jobs, we need to know how much Google actually rewards higher-placed jobs.<\/p>\n<h2 class=\"wp-block-heading\">Day 0: Making progress when you know nothing<\/h2>\n<p class=\"wp-block-paragraph\">Sometimes, just to justify all the simplifying assumptions I\u2019m going to make later, I start a project by writing down the math equation I\u2019d <em>like<\/em> to solve. I imagine ours looks something like this:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"f2f2f2\" data-has-transparency=\"false\" style=\"--dominant-color: #f2f2f2;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"105\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/unnamed-64-1024x105.png?resize=1024%2C105&#038;ssl=1\" alt=\"\" class=\"wp-image-599420 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/unnamed-64-1024x105.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/unnamed-64-300x31.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/unnamed-64-768x78.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/unnamed-64.png 1273w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><\/figure>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong><em>S<\/em><\/strong> is our vector of relevancy scores. There are N jobs, so each <em>s_i<\/em> (an element of <strong>S<\/strong>) corresponds to a different job. A function called <em>applies<\/em> turns <strong>S<\/strong> into a scalar. 
Each day we\u2019d like to find the <strong>S<\/strong> that makes that number as large as possible: the relevancy scores that generate the greatest number of job applications for intelycare.com\/jobs.<\/li>\n<li class=\"wp-block-list-item\">\n<em>applies<\/em> is a fine objective function on Day 0. Later on, our objective function could change (e.g. revenue, lifetime value). Applies are easy to count, though, which lets me spend my <a href=\"https:\/\/mcfunley.com\/choose-boring-technology\">complexity tokens<\/a> elsewhere. It\u2019s Day 0, people. We\u2019ll come back to these questions on Day 1.<\/li>\n<li class=\"wp-block-list-item\">Problem. We know nothing about the <em>applies<\/em> function until we start feeding it relevancy scores. <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f631.png?ssl=1\" alt=\"\ud83d\ude31\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">\n<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>First things first: <\/strong>Seeing that we know nothing about the <em>applies<\/em> function, our first question is, \u201chow do we choose an ongoing wave of daily S vectors <em>so we can learn<\/em> what the <em>applies<\/em> function looks like?\u201d<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">We know (1) which jobs are boosted and when, and (2) how many applies each job receives each day. Note the absence of page-load data. It\u2019s Day 0! You might not have all the data you want on Day 0, but if you\u2019re clever, you can make do with what you have.<\/li>\n<li class=\"wp-block-list-item\">Note the subtle change in our objective. Earlier, our goal was to accomplish some business objective (maximize applies), and eventually, we\u2019ll come back to that goal. For now, we\u2019ve taken off our business hat and put on our science hat. Our only goal now is to learn something. 
If we can learn something, we can use it (later) to help achieve some business objective. <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f913.png?ssl=1\" alt=\"\ud83e\udd13\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">\n<\/li>\n<li class=\"wp-block-list-item\">Since our goal is to learn <em>something<\/em>, above all we want to avoid learning <em>nothing<\/em>. Remember, it\u2019s Day 0, and we have no guarantee that the Google Monster will pay any attention to how we sort things. We may as well go for broke and make sure this thing works before throwing more time at improving it.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>How do we choose an initial wave of daily S vectors? <\/strong>We\u2019ll give every job a score of 0 (the default score) and choose a <em>random<\/em> subset of jobs to boost to 100.<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Maybe I\u2019m stating the obvious, but it has to be random if you want to isolate the effect of page position on job applications. We want the only difference between boosted jobs and other jobs to be their relative ordering on the page as determined by our relevancy scores. [I can\u2019t tell you how many phone screens I\u2019ve conducted where a candidate doubled down on running an A\/B test with the good customers in one group and the bad customers in the other group. In fairness, I\u2019ve also vetted marketing-tech vendors who do the same thing <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f62d.png?ssl=1\" alt=\"\ud83d\ude2d\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\">].<\/li>\n<li class=\"wp-block-list-item\">The randomness will be nice later on for other reasons. It\u2019s likely that some jobs benefit from page placement more than others. 
We\u2019ll have an easier time identifying those jobs with a big, randomly generated dataset.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">The plan: Subtle but important details<\/h2>\n<p class=\"wp-block-paragraph\">We know we can\u2019t boost <em>every<\/em> job. Anytime I put a job at the top of the page, I bump all other jobs down the page (a classic example of a \u201c<a href=\"https:\/\/en.wikipedia.org\/wiki\/Spillover_(experiment)\">spillover<\/a>\u201d).<\/p>\n<p class=\"wp-block-paragraph\">The spillover gets worse as I boost more and more jobs: each additional boost imposes a greater punishment on all other jobs by pushing them down in the sort (including other boosted jobs).<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">With few exceptions, nursing jobs are in-person and local, so any boosting spillovers will be limited to other nearby jobs. This is important because it means what we do in one geography shouldn\u2019t affect the results in another.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>How do we choose an initial wave of daily S vectors? (final answer) <\/strong>We\u2019ll give every job a score of 0 (the default score) and choose a random subset of jobs to boost to 100. <em>The size of the random subset will vary across geographies.<\/em><\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">We create 4 groups of distinct geographies with roughly the same amount of web traffic in each group. Each group is balanced along the key dimensions we think are important. 
We randomly boost a different percentage of jobs in each group.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Here\u2019s how it looked\u2026<\/p>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" data-dominant-color=\"fafafa\" data-has-transparency=\"false\" style=\"--dominant-color: #fafafa;\" loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"560\" src=\"https:\/\/i0.wp.com\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4-1024x560.png?resize=1024%2C560&#038;ssl=1\" alt=\"\" class=\"wp-image-599412 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4-1024x560.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4-300x164.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4-768x420.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4-1536x840.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/03\/Experiments-4.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\"><figcaption class=\"wp-element-caption\"><em>Daily Applies for boosted vs unboosted Jobs. Note how boosted jobs do better when there are fewer of them. (Image by author)<\/em><\/figcaption><\/figure>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Each black circle represents a different geography. Its elevation shows the difference in applies-per-job between boosted jobs and all other jobs (measured as a percent).<\/li>\n<li class=\"wp-block-list-item\">While groups are balanced in aggregate, the individual geographies vary considerably. The balance is still important though. Otherwise, what you see in the chart could be an artifact of the mix of urban\/rural or large\/small geographies in each group. 
As it is, we\u2019re confident the results come from our relevancy scores.<\/li>\n<li class=\"wp-block-list-item\">A quick-and-dirty interpretation of this chart is something like, \u201cthe 5% of jobs at the top of the page have ~26% more applies per day than the 95% of jobs placed below. The 10% of jobs at the top of the page have ~21% more applies per day than the 90% of jobs underneath\u2026\u201d and so on. I would never be so bold as to say that in real life, but in a perfect-experiment world it would be true.<\/li>\n<li class=\"wp-block-list-item\">By the time we boost 25% of jobs, the benefit of boosting is entirely averaged out! We diluted the perks of premium placement to practically nothing for the median geography. <a href=\"https:\/\/www.youtube.com\/watch?v=fmSO2cz2ozQ\">\u201cAnd when everyone is super, no one will be! &lt;evil laugh&gt;.\u201d<\/a> Can you imagine learning this the hard way?<\/li>\n<li class=\"wp-block-list-item\">There are many other layers to peel back. Perhaps dilution happens more quickly for nursing specialties with many pages of listings? What about states that overlap with our long-standing per-diem staffing business? These are all fine questions. We have answers for some of them, but more than I can include in this post.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\"><strong>What comes next? <\/strong>Day 1 is when the real fun begins! <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f389.png?ssl=1\" alt=\"\ud83c\udf89\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">We now have guardrails against diluting our premium experience (super important), but what is the best ~10% of jobs to boost each day? Obviously, our paying customers have priority, but then what?<\/li>\n<li class=\"wp-block-list-item\">Does boosting help some jobs more than others? 
The randomly-generated data from our experiment is well suited to answer this and many other questions. We\u2019ll save those questions for future posts.<\/li>\n<li class=\"wp-block-list-item\">Once we have a strategy for boosting, is our objective really to maximize the <em>total<\/em> number of applies? Or do we only care about the applies for boosted jobs? <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f914.png?ssl=1\" alt=\"\ud83e\udd14\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"> (Sometimes I miss the Day 0 days when all the jobs were equally relevant. Might be time to revisit those equations at the top of the post.)<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Key takeaways for those who made it this far<\/h2>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">By being thoughtful about how we generated our initial data, we quickly found a convincing answer to our question, set ourselves up to answer many future questions, and saved ourselves a ton of time trying to build an uplift model on non-existent historical data.<\/li>\n<li class=\"wp-block-list-item\">Thinking of a test? Go for it! If you execute well, you can see the results clearly in a chart and avoid all the complicated statistics (<a href=\"https:\/\/www.explainxkcd.com\/wiki\/index.php\/2400:_Statistics\">obligatory xkcd<\/a> reference). [hmm, maybe *most* of the statistics. I still love a good regression table.]<\/li>\n<li class=\"wp-block-list-item\">Spillovers are everywhere. Sometimes varying the treatment across an aggregated group can help like it did here. That can quickly axe your sample-size, but I find it better to have a small data set with meaning than a big data set that\u2019s hot garbage.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Bonus: We ran this experiment in 2023. 
How are things now?<\/h2>\n<p class=\"wp-block-paragraph\">At the time of our little geo-randomized experiment, the chart above showed that our premium job openings performed ~25% better than regular jobs (meaning they had ~25% more applies on average).<\/p>\n<p class=\"wp-block-paragraph\"><strong>Why it matters: <\/strong>We\u2019ve taken over a year to grow and iterate on our product to ensure our premium listings deliver the best possible experience. Looking at some recent numbers\u2026 (literally running the queries as I write this)<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Boosted job openings receive <strong>425% more applies <\/strong>than regular openings<\/li>\n<li class=\"wp-block-list-item\">Boosted jobs are <strong>450% more likely to receive at least one apply <\/strong>compared to regular openings<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Not bad! This comparison isn\u2019t randomized, so that 425% includes all sorts of selection bias, additional product work, a crack SEO team, and a successful email operation, all in addition to the incremental effects from premium page position. Importantly, all the extra product and marketing work is focused on a small number of jobs, just as our initial testing recommended. 
<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/s.w.org\/images\/core\/emoji\/15.0.3\/72x72\/1f3c6.png?ssl=1\" alt=\"\ud83c\udfc6\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\"><\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board\/\">Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p>Ben Tengelsen<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/experiments-illustrated-how-we-optimized-premium-listings-on-our-nursing-job-board\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board Running experiments is a task that often falls to data scientists. If that\u2019s you, congrats! It can be a rewarding and high-impact area of work, but also requires tools found outside the typical ML-heavy data science curriculum. 
Even with the best tools, only [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,83,1977,1978,229,1710,1979],"tags":[348,7,918],"class_list":["post-2329","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-data-science","category-experiment-design","category-geo-analytics","category-math","category-statisitcs","category-testing-techniques","tag-experiments","tag-how","tag-job"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2329"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=2329"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2329\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=2329"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=2329"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=2329"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}