{"id":1848,"date":"2025-02-14T07:03:10","date_gmt":"2025-02-14T07:03:10","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/02\/14\/learnings-from-a-machine-learning-engineer-part-4-the-model\/"},"modified":"2025-02-14T07:03:10","modified_gmt":"2025-02-14T07:03:10","slug":"learnings-from-a-machine-learning-engineer-part-4-the-model","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/02\/14\/learnings-from-a-machine-learning-engineer-part-4-the-model\/","title":{"rendered":"Learnings from a Machine Learning Engineer \u2014 Part 4: The Model"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\" id=\"84c6\">In this latest part of my series, I will share what I have learned about selecting a model for <a href=\"https:\/\/towardsdatascience.com\/tag\/image-classification\/\" title=\"Image Classification\">Image Classification<\/a> and how to fine-tune that model. I will also show how you can leverage the model to accelerate your labelling process, and finally how to justify your efforts by generating usage and performance statistics.<\/p>\n<p class=\"wp-block-paragraph\" id=\"f55e\">In\u00a0<a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-1-the-data\/\">Part 1<\/a>, I discussed the process of labelling the image data for your image classification project. I showed how to define \u201cgood\u201d images and create sub-classes. In\u00a0<a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-2-the-data-sets\/\">Part 2<\/a>, I went over data sets beyond the usual train-validation-test sets, such as benchmark sets, plus how to handle synthetic data and duplicate images. 
In <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-3-the-evaluation\/\">Part 3<\/a>, I explained how to apply different evaluation criteria to a trained model versus a deployed model, and how to use benchmarks to determine when to deploy a model.<\/p>\n<h2 class=\"wp-block-heading\" id=\"e364\"><strong>Model selection<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"b114\">So far I have spent a lot of time on labelling and curating the set of images, and on evaluating model performance, which is like putting the cart before the horse. I\u2019m not trying to minimize what it takes to design a massive neural network \u2014 this is a very important part of the application you are building. In my case, I spent a few weeks experimenting with different available models before settling on one that fit the bill.<\/p>\n<p class=\"wp-block-paragraph\" id=\"acaa\">Once you pick a model structure, you usually don\u2019t make any major changes to it. For me, six years into deployment, I\u2019m still using the same one. Specifically, I chose Inception V4 because it has a large input image size and an adequate number of layers to pick up on subtle image features. It also performs inference fast enough on CPU, so I don\u2019t need to run expensive hardware to serve the model.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a6ba\">Your mileage may vary. But again, the main takeaway is that focusing on your data will pay more dividends than searching for the best model.<\/p>\n<h2 class=\"wp-block-heading\" id=\"518e\"><strong>Fine tuning<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"dffd\">I will share a process that I found to work extremely well. Once I decided on the model to use, I randomly initialized the weights and let the model train for about 120 epochs before improvements plateaued at a fairly modest accuracy, around 93%. 
At this point, I performed the evaluation of the trained model (see <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-3-the-evaluation\/\">Part 3<\/a>) to clean up the data set. I also incorporated new images as part of the data pipeline (see <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-1-the-data\/\">Part 1<\/a>) and prepared the data sets for the next training run.<\/p>\n<p class=\"wp-block-paragraph\" id=\"5c56\">Before starting the next training run, I simply take the last trained model, pop the output layer, and add it back in with random weights. Since the number of output classes is constantly increasing in my case, I have to pop that layer anyway to account for the new number of classes. Importantly, I leave the rest of the trained weights as they were and allow them to continue updating for the new classes.<\/p>\n<p class=\"wp-block-paragraph\" id=\"714c\">This allows the model to converge much faster before improvements stall. After repeating this process dozens of times, the training reaches a plateau after about 20 epochs, and the test accuracy can reach 99%! The model is building upon the low-level features that it established from the previous runs while re-learning the output weights to prevent overfitting.<\/p>\n<p class=\"wp-block-paragraph\" id=\"8882\">It took me a while to trust this process, and for a few years I would train from scratch every time. But after I attempted this and saw the training time (not to mention the cost of cloud GPUs) go down while the accuracy continued to go up, I started to embrace the process. 
More importantly, I continue to see the evaluation metrics of the deployed model show solid performance.<\/p>\n<h2 class=\"wp-block-heading\" id=\"1134\"><strong>Augmentation<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"ef83\">During training, you can apply transformations to your images (called \u201caugmentation\u201d) to get more diversity from your data set. With our zoo animals, it is fairly safe to apply a left-right flip, slight rotations clockwise and counterclockwise, and a slight resize that will zoom in and out.<\/p>\n<p class=\"wp-block-paragraph\" id=\"c0f1\">With these transformations in mind, make sure your images are still able to act as good training images. In other words, an image where the subject is already small will be even smaller with a zoom out, so you probably want to discard the original. Also, some of your original pictures may need to be re-oriented by 90 degrees to be upright since a further rotation would make them look unusual.<\/p>\n<h2 class=\"wp-block-heading\" id=\"0777\"><strong>Bulk identification<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"50dc\">As I mentioned in <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-1-the-data\/\">Part 1<\/a>, you can use the trained model to assist you in labelling images one at a time. But the way to take this even further is to have your newly trained model identify hundreds at a time while building a list of the results that you can then filter.<\/p>\n<p class=\"wp-block-paragraph\" id=\"b212\">Typically, we have large collections of\u00a0<strong>unlabelled<\/strong>\u00a0images that have come in either through regular usage of the application or some other means. Recall from <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-1-the-data\/\">Part 1<\/a> that you can assign an \u201cunknown\u201d label to interesting pictures when you have no clue what they show. 
By using the bulk identification method, we can sift through the collections quickly to target the labelling once we know what they are.<\/p>\n<p class=\"wp-block-paragraph\" id=\"73a7\">By combining your current image counts with the bulk identification results, you can target classes that need expanded coverage. Here are a few ways you can leverage bulk identification:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\n<strong>Increase low image counts<\/strong>\u00a0\u2014 Some of your classes may have just barely made the cutoff to be included in the training set, which means you need more examples to improve coverage. Filter for classes that have low image counts.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Replace staged or synthetic images<\/strong>\u00a0\u2014 Some classes may be built entirely using non-real-world images. These pictures may be good enough to get started with, but may cause performance issues down the road because they look different than what typically comes through. Filter for classes that depend on staged images.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Find look-alike classes<\/strong>\u00a0\u2014 A class in your data set may look like another one. For example, let\u2019s say your model can identify an antelope, and that looks like a gazelle, which your model cannot identify yet. Setting a filter for antelope and a lower confidence score may reveal gazelle images that you can label.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Unknown labels<\/strong>\u00a0\u2014 You may not have known how to identify the dozens of cute wallaby pictures, so you saved them under \u201cUnknown\u201d because they were good images. 
Now that you know what they are, you can filter for the look-alike kangaroo and quickly add a new class.<\/li>\n<li class=\"wp-block-list-item\">\n<strong>Mass removal of low scores<\/strong>\u00a0\u2014 As a way to clean out your large collection of unlabelled images that have nothing worth labelling, set a filter for the lowest scores.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\" id=\"33da\"><strong>Throw-away training run<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"3169\">Recall from <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-2-the-data-sets\/\">Part 2<\/a> the decision I made to use image cutoffs, which allows us to ensure an adequate number of example images of a class before we train and serve a model to the public. The problem is that you may have a number of classes that are\u00a0<strong>just<\/strong>\u00a0below your cutoff (in my case, 40) and don\u2019t make it into the model.<\/p>\n<p class=\"wp-block-paragraph\" id=\"fcd8\">The way I approach this is with a \u201cthrow-away\u201d training run that I do not intend to move to production. I will decrease the lower cutoff from 40 to perhaps 35, build my train-validation-test sets, then train and evaluate like I normally do. The most important part of this is the bulk identification at the end!<\/p>\n<p class=\"wp-block-paragraph\" id=\"cc11\">There is a chance that somewhere in the large collection of unlabelled images I will find the few that I need. Doing the bulk identification with this throw-away model helps find them.<\/p>\n<h2 class=\"wp-block-heading\" id=\"997a\"><strong>Performance reporting<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"9d88\">One very important aspect of any machine learning application is being able to show usage and performance reports. 
Your manager will likely want to see how many times the application is being used to justify the expense, and you as the ML engineer will want to see how the latest model is performing compared to the previous one.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a54c\">You should build logging into your model serving to record every transaction going through the system. Also, the manual evaluations from <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-3-the-evaluation\/\">Part 3<\/a> should be recorded so you can report on performance for such things as accuracy over time, by model version, by confidence scores, by class, etc. You will be able to detect trends and make adjustments to improve the overall solution.<\/p>\n<p class=\"wp-block-paragraph\" id=\"014a\">There are a lot of reporting tools, so I won\u2019t recommend one over the other. Just make sure you are collecting as much information as you can to build these dashboards. This will justify the time, effort, and cost associated with maintaining the application.<\/p>\n<h2 class=\"wp-block-heading\" id=\"1572\"><strong>Conclusion<\/strong><\/h2>\n<p class=\"wp-block-paragraph\" id=\"7507\">We covered a lot of ground across this four-part series on building an image classification project and deploying it in the real world. It all starts with the data, and by investing the time and effort into maintaining the highest quality image library, you can reach impressive levels of model performance that will gain the trust and confidence of your business partners.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a56c\">As a <a href=\"https:\/\/towardsdatascience.com\/tag\/machine-learning-engineer\/\" title=\"Machine Learning Engineer\">Machine Learning Engineer<\/a>, you are primarily responsible for building and deploying your model. But it doesn\u2019t stop there \u2014 dive into the data. 
The more familiar you are with the data, the better you will understand the strengths and weaknesses of your model. Take a close look at the evaluations and use them as an opportunity to adjust the data set.<\/p>\n<p class=\"wp-block-paragraph\" id=\"e6e9\">I hope these articles have helped you find new ways to improve your own machine learning project. And by the way, don\u2019t let the machine do all the learning \u2014 as humans, our job is to continue our own learning, so don\u2019t ever stop!<\/p>\n<p class=\"wp-block-paragraph\" id=\"013a\">Thank you for taking this deep dive with me into a data-driven approach to model optimization. I look forward to your feedback and to hearing how you apply this to your own application.<\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-4-the-model\/\">Learnings from a Machine Learning Engineer \u2014 Part 4: The Model<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p>David Martin<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/learnings-from-a-machine-learning-engineer-part-4-the-model\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learnings from a Machine Learning Engineer \u2014 Part 4: The Model In this latest part of my series, I will share what I have learned on selecting a model for Image Classification and how to fine tune that model. 
I will also show how you can leverage the model to accelerate your labelling process, and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1738,62,1322,70,909,1739,1498],"tags":[7,103,1740],"class_list":["post-1848","post","type-post","status-publish","format-standard","hentry","category-ai-model","category-aimldsaimlds","category-image-classification","category-machine-learning","category-machine-learning-engineer","category-model-deployment","category-model-training","tag-how","tag-model","tag-part"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1848"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=1848"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1848\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=1848"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=1848"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=1848"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}