{"id":1431,"date":"2025-01-25T07:02:55","date_gmt":"2025-01-25T07:02:55","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/01\/25\/deep-learning-for-click-prediction-in-mobile-adtech-9739fe3f52de\/"},"modified":"2025-01-25T07:02:55","modified_gmt":"2025-01-25T07:02:55","slug":"deep-learning-for-click-prediction-in-mobile-adtech-9739fe3f52de","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/01\/25\/deep-learning-for-click-prediction-in-mobile-adtech-9739fe3f52de\/","title":{"rendered":"Deep Learning for Click Prediction in Mobile AdTech"},"content":{"rendered":"<div>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2AK3RRU2maaQdGOIvT9tMFIw.png?ssl=1\"><figcaption>Source: <a href=\"https:\/\/pixabay.com\/illustrations\/rays-stars-light-explosion-galaxy-9350519\/\">https:\/\/pixabay.com\/illustrations\/rays-stars-light-explosion-galaxy-9350519\/<\/a><\/figcaption><\/figure>\n<h4>Machine Learning for Real-Time Bidding<\/h4>\n<p>The past few years were a revolution for the mobile advertising and gaming industries, with the broad adoption of neural networks for advertising tasks, including click prediction. This migration occurred prior to the success of Large Language Models (LLMs) and other AI innovations, but is building on the momentum of this wave. The mobile gaming industry is spending billions on User Acquisition every year, and top players in this space such as Applovin have market caps of over $100B. 
In this post, we\u2019ll discuss a conventional ML approach for click prediction, offer motivations for the migration to deep learning for this task, provide a hands-on example of the benefits of this approach using a data set from <a href=\"https:\/\/www.kaggle.com\/\">Kaggle<\/a>, and detail some of the enhancements that this approach provides.<\/p>\n<p>Most large tech companies in the AdTech space are likely using deep learning for predicting user behavior. Social media platforms have embraced the migration from classic machine learning (ML) to deep learning, as indicated by this <a href=\"https:\/\/www.reddit.com\/r\/RedditEng\/comments\/16bovfo\/our_journey_to_developing_a_deep_neural_network\/\">Reddit post<\/a> and this <a href=\"https:\/\/www.linkedin.com\/blog\/engineering\/machine-learning\/challenges-and-practical-lessons-from-building-a-deep-learning-b\">LinkedIn post<\/a>. In the mobile gaming space, <a href=\"https:\/\/www.moloco.com\/blog\/dr-sechan-oh-explains-molocos-machine-learning\">Moloco<\/a>, <a href=\"https:\/\/liftoff.io\/blog\/cortex-ml-platform-announcement\/\">Liftoff<\/a>, and <a href=\"https:\/\/cloud.google.com\/blog\/products\/compute\/applovin-builds-ai-ad-platform-on-google-cloud-g2-vms\">Applovin<\/a> have all shared details on their migration to deep learning or hardware acceleration to improve their user acquisition platforms. Most Demand Side Platforms (DSPs) are now looking to leverage neural networks to improve the value that their platforms provide for mobile user acquisition.<\/p>\n<p>We\u2019ll start by discussing logistic regression as an industry standard for predicting user actions, cover some of the shortcomings of this approach, and then showcase deep learning as a solution for click prediction. 
We\u2019ll provide a deep dive on implementations for both a <a href=\"https:\/\/alumni.soe.ucsc.edu\/~bweber\/ClickLogit.html\">classic ML notebook<\/a> and a <a href=\"https:\/\/alumni.soe.ucsc.edu\/~bweber\/ClickDNN.html\">deep learning notebook<\/a> for the task of predicting whether a user is going to click on an ad. We won\u2019t dive into the state of the art, but we will highlight where deep learning provides many benefits.<\/p>\n<p>All images in this post, with the exception of the header image, were created by the author in the notebooks linked above. The Kaggle data set that we explore in this post has the <a href=\"https:\/\/creativecommons.org\/publicdomain\/zero\/1.0\/\">CC0: Public Domain<\/a>\u00a0license.<\/p>\n<h3>Cost Per Click\u00a0Modeling<\/h3>\n<p>One of the goal types that DSPs typically provide for user acquisition is a cost per click model, where the advertiser is charged each time that the platform serves an impression on a mobile device and the user clicks. We\u2019ll focus on this goal type to keep things simple, but most advertisers prefer goal types focused on driving installs or acquiring users that will spend money in their\u00a0app.<\/p>\n<p>In programmatic bidding, a DSP is integrated with one or more ad exchanges, which provide inventory for the platform to bid on. Most exchanges use a version of the <a href=\"https:\/\/www.iab.com\/wp-content\/uploads\/2016\/03\/OpenRTB-API-Specification-Version-2-5-FINAL.pdf\">OpenRTB specification<\/a> to send bid requests to DSPs and get back responses in a standardized format. For each ad request from a Supply Side Platform (SSP), the exchange runs an auction and the DSP that responds with the highest price wins. 
The exchange then provides the winning bid response to the SSP, which may result in an ad impression on a mobile\u00a0device.<\/p>\n<p>In order for a DSP to integrate with an ad exchange, there is an onboarding process to make sure that the DSP can meet the technical requirements of the exchange, which typically include responding to bid requests within 120 milliseconds. What makes this a huge challenge is that some exchanges provide over 1 million bid requests per second, and DSPs are usually integrating with several exchanges. For example, <a href=\"https:\/\/www.moloco.com\/r-d-blog\/challenges-in-building-a-scalable-demand-side-platform-dsp-service\">Moloco responds<\/a> to over 5 million queries per second (QPS) during peak capacity. Because of the latency requirements and massive scale of requests, it\u2019s challenging to use machine learning for user acquisition within a DSP, but it\u2019s also a requirement in order to meet advertiser goals.<\/p>\n<p>In order to make money as a DSP you need to be able to deliver ad impressions that meet your advertiser goals, while also generating net revenue. To accomplish this, a DSP needs to bid less than the expected value that an impression will deliver, while also bidding high enough to exceed the bid floor of a request and win in auctions against other DSPs. A demand-side platform is billed per impression shown, which corresponds to a CPM (cost per mille, i.e. cost per thousand impressions) model; for simplicity, the values here are expressed per impression. If the advertiser goal is a target cost per click (CPC), then the DSP needs to translate the CPC value to a CPM value for bidding. We can do this by using machine learning to predict the likelihood that a user clicks on an impression, which we call <em>p_ctr<\/em>. 
We can then calculate a bid price as\u00a0follows:<\/p>\n<pre>cpm = target_cpc * p_ctr<br>bid_price = cpm * bid_shade<\/pre>\n<p>We use the likelihood of a click event to convert from cost per click to cost per impression and then apply a bid shade with a value of less than 1.0 to make sure that we are delivering more value for advertisers than we are paying to the ad exchange for serving the impression.<\/p>\n<p>In order for a click prediction model to perform well for programmatic user acquisition, we want a model that has the following properties:<\/p>\n<ol>\n<li>\n<strong>Large Bias<br \/><\/strong>We want a click model that is highly discriminative and able to differentiate between impressions unlikely to result in a click and ones that are highly likely to result in a click. If a model does not have sufficient bias, it won\u2019t be able to compete with other DSPs in auctions.<\/li>\n<li>\n<strong>Well Calibrated<br \/><\/strong>We want the predicted and actual conversion rates of the model to align well for the ad impressions the DSP purchases. This means we have a preference for models where the output can be interpreted as a probability of a conversion occurring. Poor calibration will result in inefficient spending. A sample calibration plot is shown\u00a0below.<\/li>\n<li>\n<strong>Fast Evaluation<br \/><\/strong>We want to reduce our compute cost when bidding on millions of requests per second and have models that are fast at inference.<\/li>\n<li>\n<strong>Parallel Evaluation<\/strong><br \/>Ideally, we want to be able to run model inference in parallel to improve throughput. 
For a single bid request, a DSP may be considering hundreds of campaigns to bid for, and each one needs a <em>p_ctr<\/em>\u00a0value.<\/li>\n<\/ol>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/966\/1%2AbeCmyaZdfGTMVz2GAY92Xw.png?ssl=1\"><figcaption>A model calibration plot (Created by the author in the <a href=\"https:\/\/alumni.soe.ucsc.edu\/~bweber\/ClickLogit.html\">ClickLogit Notebook<\/a>)<\/figcaption><\/figure>\n<p>Many ad tech platforms started with logistic regression for click prediction, because it works well for the first three desired properties. Over time, it was discovered that deep learning models could perform better than logistic regression on the bias goal, with neural networks being better at discriminating between click and no-click impressions. Additionally, neural networks can use batch evaluation and align well with the fourth property of parallel evaluation.<\/p>\n<p>DSPs were able to push logistic regression models pretty far, which is what we\u2019ll cover in the next section, but these models do have some limitations in their application to user acquisition. Deep neural networks (DNNs) can overcome some of these issues, but present new challenges of their\u00a0own.<\/p>\n<h3>The Big Logistic\u00a0Era<\/h3>\n<p>Ad Tech companies have been using logistic regression for more than a decade for click prediction. For example, Facebook presented using logit in combination with other models at <a href=\"https:\/\/research.facebook.com\/publications\/practical-lessons-from-predicting-clicks-on-ads-at-facebook\/\">ADKDD 2014<\/a>. There are many different ways of using logistic regression for click prediction, but I\u2019ll focus on a single approach I worked on in the past called Big Logistic. 
The general idea was to turn all of your features into tokens, create combinations of tokens to represent crosses or feature interactions, and then create a list of tokens that you use to convert your input features into a sparse vector representation. It\u2019s an approach where every feature is 1-hot encoded and all of the features are binary, which helps simplify hyperparameter tuning for the click model. It\u2019s an approach that can support numeric, categorical, and many-hot features as\u00a0inputs.<\/p>\n<p>To determine what this approach looks like in practice, we\u2019ll provide a hands-on example of training a click prediction model using the <a href=\"https:\/\/www.kaggle.com\/datasets\/arashnic\/ctr-in-advertisement\">CTR In Advertisement<\/a> Kaggle data set. The full notebook for feature encoding, model training and evaluation is <a href=\"https:\/\/alumni.soe.ucsc.edu\/~bweber\/ClickLogit.html\">available here<\/a>. I used Databricks, PySpark, and MLlib for this pipeline.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2A8ydYYLsyCeM_zsM66ZP5EQ.png?ssl=1\"><figcaption>Sample Data from the Kaggle Training Data\u00a0Set<\/figcaption><\/figure>\n<p>The dataset provides a training data set with labels and a test data set without labels. For this exercise we\u2019ll split the training file into train and test groups, so that we have labels available for all records. We create a 90\/10% split where the train set has 414k records and test has 46k records. The data set has 15 columns, which includes a label, 2 columns that we\u2019ll ignore (session_id and user_id) and 12 categorical values that we\u2019ll use as features in our model. A few sample records are shown in the table\u00a0above.<\/p>\n<p>The first step we\u2019ll perform is tokenizing the data set, which is a form of 1-hot encoding. 
We convert each column to a string value by concatenating the feature name and feature value. For example, we would create the following tokens for the first row in the above\u00a0table:<\/p>\n<p><em>[\u201cproduct_c\u201d, \u201ccampaign_id_359520\u201d, \u201cwebpage_id_13787\u201d,\u00a0..]<\/em><\/p>\n<p>For null values, we use \u201cnull\u201d as the value, e.g. \u201cproduct_null\u201d. We also create all combinations of two features, which generates additional tokens:<\/p>\n<p><em>[\u201cproduct_c*campaign_id_359520\u201d, \u201cproduct_c*webpage_id_13787\u201d, \u201ccampaign_id_359520*webpage_id_13787\u201d,..]<\/em><\/p>\n<p>We use a UDF on the PySpark dataframe to convert the 12 columns into a vector of strings. The resulting dataframe includes the token list and label, as shown\u00a0below.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2ADrOviK1E8E3taYh68Hhwtg.png?ssl=1\"><figcaption>The Tokenized Data\u00a0Set<\/figcaption><\/figure>\n<p>We then create a top tokens list, assign an index to each token in this list, and use the mapping of token name to token index to encode the data. We limited our token list to values where we have at least 1,000 examples, which resulted in roughly 2,500\u00a0tokens.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/522\/1%2ASC-Q8P6IjpLrO3XaXrCojA.png?ssl=1\"><\/figure>\n<p>We then apply this token list to each record in the data set to convert from the token list to a sparse vector representation. If a record includes the token for an index, the value is set to 1, and if the token is missing the value is set to 0. 
This results in a data set that we can use with MLlib to train a logistic regression model.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1%2A7SzMSdMX0UN4nDzvenM68A.png?ssl=1\"><figcaption>The encoded data set, ready for\u00a0MLlib<\/figcaption><\/figure>\n<p>We split the dataset into train and test groups, fit the model on the train data set, and then transform the test data set to get predictions.<\/p>\n<pre>from pyspark.ml.classification import LogisticRegression<br><br># elasticNetParam = 0 selects a pure L2 penalty<br>classifier = LogisticRegression(featuresCol = 'features',<br>    labelCol = 'label', maxIter = 50, regParam = 0.01, elasticNetParam = 0)<br>lr_model = classifier.fit(train_df)<br>pred_df = lr_model.transform(test_df).cache()<\/pre>\n<p>This process resulted in the following offline metrics, which we\u2019ll compare to a deep learning model in the next\u00a0section.<\/p>\n<pre>Actual Conv: 0.06890<br>Predicted Conv: 0.06770<br>Log Loss: 0.24795<br>ROC AUC: 0.58808<br>PR AUC: 0.09054<\/pre>\n<p>The AUC metrics don\u2019t look great, but there isn\u2019t much signal in the data set with the features that we explored, and other participants in the Kaggle competition generally had lower ROC metrics. One other limitation of the data set is that the categorical values are low cardinality, with only a small number of distinct values. This resulted in a low parameter count, with roughly 2,500 features, which limited the bias of the\u00a0model.<\/p>\n<p>Logistic regression works well for click prediction, but we run into challenges when dealing with high cardinality features. In mobile ad tech, the publisher app, where the ad is rendered, is a high cardinality feature, because there are millions of potential mobile apps that may render an ad. If we want to include the publisher app as a feature in our model, and are using 1-hot encoding, we are going to end up with a large parameter count. 
This is especially the case when we perform feature crosses between the publisher app and other high cardinality features, such as the device\u00a0model.<\/p>\n<p>I\u2019ve worked with logistic regression click models that have more than 50 million parameters. At this scale, MLlib\u2019s implementation of logistic regression runs into training issues, because it densifies the vectors in its training loop. To avoid this bottleneck, I used the <a href=\"https:\/\/github.com\/TalkingData\/Fregata\">Fregata library<\/a>, which performs gradient descent using the sparse vector directly in a model averaging strategy.<\/p>\n<p>The other issue with large click models is model inference. If you include too many parameters in your logit model, it may be slow to evaluate, significantly increasing your model serving\u00a0costs.<\/p>\n<h3>The Deep Learning\u00a0Era<\/h3>\n<p>Deep learning is a good solution for click models, because it provides methods for working efficiently with very sparse features with high cardinality. One of the key layers that we\u2019ll use in our deep learning model is an embedding layer, which takes a categorical feature as an input and produces a dense vector as an output. With an embedding layer, we learn a vector for each of the entries in our vocabulary for a categorical feature, and the number of parameters is the size of the vocabulary times the output dense vector size, which we can control. For example, a vocabulary of 1,000 entries with an output size of 5 requires 5,000 parameters. Neural networks can reduce the parameter count by creating interactions between the dense outputs of the embedding layers, rather than crossing the sparse 1-hot encoded features as in the logistic regression approach.<\/p>\n<p>Embedding layers are just one way that neural networks can provide improvements over logistic regression models, because deep learning frameworks provide a variety of layer types and architectures. We\u2019ll focus on embeddings for our sample model to keep things simple. 
We\u2019ll create a pipeline for encoding the data set into TensorFlow Records and then train a model using embeddings and cross layers to perform click prediction. The full notebook for data preparation, model training and evaluation is <a href=\"https:\/\/alumni.soe.ucsc.edu\/~bweber\/ClickDNN.html\">available here<\/a>.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/596\/1%2AZPo6bKjg3yG6q6TH0WAZNg.png?ssl=1\"><figcaption>Vocabulary for the Product\u00a0Feature<\/figcaption><\/figure>\n<p>The first step that we perform is generating a vocabulary for each of the features that we want to encode. For each feature, we find all values with more than 100 instances, and everything else is grouped into an out-of-vocab (OOV) value. We then encode all of the categorical features and combine them into a single tensor named <em>int<\/em>, as shown\u00a0below.<\/p>\n<figure><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/534\/1%2ASWCU68o2Dj_aU7AIib5ASw.png?ssl=1\"><figcaption>Features Reshaped into a\u00a0Tensor<\/figcaption><\/figure>\n<p>We then save the Spark dataframe as TensorFlow records to cloud\u00a0storage.<\/p>\n<pre>output_path = \"dbfs:\/mnt\/ben\/kaggle\/train\/\"<br>train_df.write.format(\"tfrecords\").mode(\"overwrite\").save(output_path)<\/pre>\n<p>We then copy the files to the driver node and create TensorFlow data sets for training and evaluating the\u00a0model.<\/p>\n<pre>import tensorflow as tf<br><br>def getRecords(paths):<br>    # one encoded integer per categorical feature, plus the click label<br>    features = {<br>        'int': tf.io.FixedLenFeature([len(vocab_sizes)], tf.int64),<br>        'label': tf.io.FixedLenFeature([1], tf.int64)<br>    }<br> <br>    @tf.function<br>    def _parse_example(x):<br>        f = tf.io.parse_example(x, features)<br>        return f, f.pop(\"label\")<br> <br>    # batch the serialized records first, then parse each batch in one call<br>    dataset = tf.data.TFRecordDataset(paths)<br>    dataset = dataset.batch(10000)<br>    dataset = dataset.map(_parse_example)<br>    
return dataset<br><br>training_data = getRecords(train_paths)<br>test_data = getRecords(test_paths)<\/pre>\n<p>We then create a Keras model, where the input feeds one embedding layer per categorical feature, followed by two hidden cross layers and a final output layer with a sigmoid activation for the propensity prediction.<\/p>\n<pre>import tensorflow as tf<br>from tensorflow import keras<br>import tensorflow_recommenders as tfrs<br><br># a single integer input that holds one encoded value per categorical feature<br>cat_input = tf.keras.Input(shape=(len(vocab_sizes),),<br>    name=\"int\", dtype='int64')<br>input_layers = [cat_input]<br><br># slice out each feature column and learn a 5-dimensional embedding for it<br>cross_inputs = []<br>for attribute in categories_index:<br>    index = categories_index[attribute]<br>    size = vocab_sizes[attribute]<br><br>    category_input = cat_input[:, index:(index + 1)]<br>    embedding = keras.layers.Flatten()(<br>        keras.layers.Embedding(size, 5)(category_input))<br>    cross_inputs.append(embedding)<br><br># two stacked cross layers over the concatenated embeddings<br>cross_input = keras.layers.Concatenate()(cross_inputs)<br>cross_layer = tfrs.layers.dcn.Cross()<br>crossed_output = cross_layer(cross_input, cross_input)<br><br>cross_layer = tfrs.layers.dcn.Cross()<br>crossed_output = cross_layer(cross_input, crossed_output)<br><br># sigmoid output for the click propensity<br>sigmoid_output = tf.keras.layers.Dense(1, activation=\"sigmoid\")(crossed_output)<br>model = tf.keras.Model(inputs=input_layers, outputs=[sigmoid_output])<br>model.summary()<\/pre>\n<p>The resulting model has 7,951 parameters, which is about 3 times the size of our logistic regression model. If the categories had larger cardinalities, then we would expect the parameter count of the logit model to be higher. 
We train the model for 40\u00a0epochs:<\/p>\n<pre>metrics=[tf.keras.metrics.AUC(), tf.keras.metrics.AUC(curve=\"PR\")]<br><br>model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),<br>    loss=tf.keras.losses.BinaryCrossentropy(), metrics=metrics)<br>history = model.fit(x = training_data, epochs = 40,<br>    validation_data = test_data, verbose=0)<\/pre>\n<p>We can now compare the offline metrics between our logistic regression and DNN\u00a0models:<\/p>\n<pre>                Logit    DNN <br>Actual Conv:    0.06890  0.06890<br>Predicted Conv: 0.06770  0.06574<br>Log Loss:       0.24795  0.24758<br>ROC AUC:        0.58808  0.59284<br>PR AUC:         0.09054  0.09249<br><\/pre>\n<p>We do see improvements to the log loss metric where lower is better and the AUC metrics where higher is better. The main improvement is to the precision-recall (PR) AUC metric, which may help the model perform better in auctions. One of the issues with the DNN model is that the model calibration is worse, and the DNN average predicted value is further off than the logistic regression model. We would need to do a bit more model tuning to improve the calibration of the\u00a0model.<\/p>\n<h3><strong>What\u2019s Next?<\/strong><\/h3>\n<p>We are now in the era of deep learning for ad tech and companies are using a variety of architectures to deliver advertiser goals for user acquisition. In this post, we showed how migrating from logistic regression to a simple neural network with embedding layers can provide better offline metrics for a click prediction model. 
Here are some additional ways we could leverage deep learning to improve click prediction:<\/p>\n<ol>\n<li>\n<strong>Use Embeddings from Pre-trained Models<br \/><\/strong>We can use models such as <a href=\"https:\/\/en.wikipedia.org\/wiki\/BERT_(language_model)\">BERT<\/a> to convert app store descriptions into vectors that we can use as input to the click\u00a0model.<\/li>\n<li>\n<strong>Explore New Architectures <br \/><\/strong>We could explore the <a href=\"https:\/\/arxiv.org\/abs\/1708.05123\">DCN<\/a> and <a href=\"https:\/\/arxiv.org\/abs\/2012.06678\">TabTransformer<\/a> architectures.<\/li>\n<li>\n<strong>Add Non-Tabular Data<br \/><\/strong>We could use <a href=\"https:\/\/github.com\/christiansafka\/img2vec\">img2vec<\/a> to create input embeddings from creative\u00a0assets.<\/li>\n<\/ol>\n<p>Thanks for\u00a0reading!<\/p>\n<p>Ben Weber is a machine learning engineer with over a decade of experience in gaming and ad tech with prior roles at Zynga, Microsoft, Amazon, and Electronic Arts.<\/p>\n<hr>\n<p><a href=\"https:\/\/towardsdatascience.com\/deep-learning-for-click-prediction-in-mobile-adtech-9739fe3f52de\">Deep Learning for Click Prediction in Mobile AdTech<\/a> was originally published in <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Ben Weber<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/medium.com\/m\/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2Fdeep-learning-for-click-prediction-in-mobile-adtech-9739fe3f52de\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n 
<BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Deep Learning for Click Prediction in Mobile AdTech Source: https:\/\/pixabay.com\/illustrations\/rays-stars-light-explosion-galaxy-9350519\/ Machine Learning for Real-Time Bidding The past few years were a revolution for the mobile advertising and gaming industries, with the broad adoption of neural networks for advertising tasks, including click prediction. This migration occurred prior to the success of Large Language Models (LLMs) and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1466,62,83,166,70,1467],"tags":[1468,4,199],"class_list":["post-1431","post","type-post","status-publish","format-standard","hentry","category-advertising","category-aimldsaimlds","category-data-science","category-hands-on-tutorials","category-machine-learning","category-mobile-app-development","tag-click","tag-deep","tag-learning"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1431"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=1431"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1431\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=1431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=1431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=1431"}],
"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}