{"id":3480,"date":"2025-05-01T07:02:26","date_gmt":"2025-05-01T07:02:26","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/05\/01\/why-are-convolutional-neural-networks-great-for-images\/"},"modified":"2025-05-01T07:02:26","modified_gmt":"2025-05-01T07:02:26","slug":"why-are-convolutional-neural-networks-great-for-images","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/05\/01\/why-are-convolutional-neural-networks-great-for-images\/","title":{"rendered":"Why Are Convolutional Neural Networks Great For\u00a0Images?"},"content":{"rendered":"<p>    Why Are Convolutional Neural Networks Great For\u00a0Images?<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>\n<p class=\"wp-block-paragraph\"><mdspan datatext=\"el1746060965933\" class=\"mdspan-comment\">The<\/mdspan>\u00a0<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/abs\/pii\/0893608089900208?via%3Dihub\" target=\"_blank\" rel=\"noreferrer noopener\">Universal Approximation Theorem<\/a>\u00a0states that a neural network with a single hidden layer and a nonlinear activation function can approximate any continuous function.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Practical issues aside, such that the number of neurons in this hidden layer would grow enormously large, we do not need other network architectures. A simple feed-forward neural network could do the trick.<\/p>\n<p class=\"wp-block-paragraph\">It is challenging to estimate how many network architectures have been developed.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">When you open the popular AI model platform\u00a0<a href=\"https:\/\/huggingface.co\/models\" target=\"_blank\" rel=\"noreferrer noopener\">Hugging Face<\/a>\u00a0today, you will find more than one million pretrained models. Depending on the task, you will use different architectures, for example transformers for natural language processing and convolutional networks for image classification.<\/p>\n<p class=\"wp-block-paragraph\">So, why do we\u00a0<em>need<\/em>\u00a0so many <a href=\"https:\/\/towardsdatascience.com\/tag\/neural-network\/\" title=\"Neural Network\">Neural Network<\/a> architectures?<\/p>\n<p class=\"wp-block-paragraph\">In this post, I want offer an answer to this question from a physics perspective. It is the structure in the data that inspires novel neural network architectures.<\/p>\n<h3 class=\"wp-block-heading\">Symmetry and invariance<\/h3>\n<p class=\"wp-block-paragraph\">Physicists love symmetry. The fundamental laws of physics employ symmetries, such as the fact that the motion of a particle can be described by the same equations, regardless of where it finds itself in time and space.<\/p>\n<p class=\"wp-block-paragraph\">Symmetry always implies invariance with respect to some transformation. These ice crystals are an example of translational invariance. The smaller structures look the same, regardless of where they appear in the larger context.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/1cyok-skFlIXHo3MxRMByEQ.jpg?ssl=1\" alt=\"Regular ice crystals\" class=\"wp-image-602606\"><figcaption class=\"wp-element-caption\">By Photo by PtrQs, CC BY-SA 4.0,\u00a0<a href=\"https:\/\/commons.wikimedia.org\/w\/index.php?curid=127396876\" rel=\"noreferrer noopener\" target=\"_blank\">https:\/\/commons.wikimedia.org\/w\/index.php?curid=127396876<\/a><\/figcaption><\/figure>\n<h3 class=\"wp-block-heading\">Exploiting symmetries: convolutional neural\u00a0network<\/h3>\n<p class=\"wp-block-paragraph\">If you already know that a certain symmetry persists in your data, you can exploit this fact to simplify your neural network architecture.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s explain this with the example of image classification. The panel shows three scenes including a goldfish. It can show up in any location within the image, but the image should always be classified as\u00a0<em>goldfish<\/em>.<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/1CjKp7lOtNsaXHxLy73ul9Q.jpg?ssl=1\" alt=\"Three panels showing the same goldfish in different locations\" class=\"wp-image-602604\"><figcaption class=\"wp-element-caption\">Images created by the author using Midjourney.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">A feed-forward neural network could certainly achieve this, given sufficient training data.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">This network architecture requires a flattened input image. Weights are then assigned between each input layer neuron (representing one pixel in the image) and each hidden layer neuron. Also, weights are assigned between each neuron in the hidden and the output layer.<\/p>\n<p class=\"wp-block-paragraph\">Along with this architecture, the panel shows a \u201cflattened\u201d version of the three goldfish images from above. Do they still look alike to you?<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/1gjdefs9wO-3Hrwp604OBaQ.jpg?ssl=1\" alt=\"Schematic explaining the flattening of an image and the resulting network architecture.\" class=\"wp-image-602605\"><figcaption class=\"wp-element-caption\">Image created by the author. Using images created by the author with Midjourney and ANN architecture created with\u00a0<a href=\"https:\/\/alexlenail.me\/NN-SVG\/\" rel=\"noreferrer noopener\" target=\"_blank\">https:\/\/alexlenail.me\/NN-SVG\/<\/a>.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">By flattening the image, we have incurred two problems:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Images that contain a similar object do not look alike once they are flattened,<\/li>\n<li class=\"wp-block-list-item\">For high-resolution images, we will need to train a lot of weights connecting the input layer and the hidden layer.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Convolutional networks, on the other hand, work with kernels. Kernel sizes typically range between 3 and 7 pixels, and the kernel parameters are learnable in training.<\/p>\n<p class=\"wp-block-paragraph\">The kernel is applied like a raster to the image. A convolutional layer will have more than one kernel, allowing each kernel to focus on different aspects of the image.\u00a0<\/p>\n<figure class=\"wp-block-image\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/1hcEZQnOOq2kR4oyB2DeheA.png?ssl=1\" alt=\"\" class=\"wp-image-602607\"><figcaption class=\"wp-element-caption\">Image created by the\u00a0author.<\/figcaption><\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<p class=\"wp-block-paragraph\">For example, one kernel might pick up on horizontal lines in the image, while another might pick up on convex curves.<\/p>\n<p class=\"wp-block-paragraph\">Convolutional neural networks preserve the order of pixels and are great to learn localized structures. The convolutional layers can be nested to create deep layers. In conjunction with pooling layers, high-level features can be learned.<\/p>\n<p class=\"wp-block-paragraph\">The resulting networks are considerably smaller than if you would use a fully-connected neural network. A convolutional layer only requires\u00a0<em>kernel_size x kernel_size x n_kernel<\/em>\u00a0trainable parameters.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">You can save memory and computational budget, all by exploiting the fact that your object may be located\u00a0<em>anywhere<\/em> within your image!<\/p>\n<p class=\"wp-block-paragraph\">More advanced deep learning architectures that exploit symmetries are Graph Neural Networks and physics-informed neural networks.<\/p>\n<h3 class=\"wp-block-heading\">Summary<\/h3>\n<p class=\"wp-block-paragraph\">Convolutional neural networks work great with images because they preserve the local information in your image. Instead of flattening all the pixels, rendering the image meaningless, kernels with learnable parameters pick up on local features.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h3 class=\"wp-block-heading\">Further reading<\/h3>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.sciencedirect.com\/science\/article\/abs\/pii\/0893608089900208?via%3Dihub\">Universal Approximation Theorem<\/a><\/li>\n<li class=\"wp-block-list-item\">Documentation for convolutional layers in\u00a0<a href=\"https:\/\/pytorch.org\/docs\/stable\/generated\/torch.nn.Conv2d.html\" target=\"_blank\" rel=\"noreferrer noopener\">Torch<\/a>\n<\/li>\n<\/ul>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/why-are-convolutional-neural-networks-great-for-images\/\">Why Are Convolutional Neural Networks Great For\u00a0Images?<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Caroline Arnold<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/why-are-convolutional-neural-networks-great-for-images\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Why Are Convolutional Neural Networks Great For\u00a0Images? The\u00a0Universal Approximation Theorem\u00a0states that a neural network with a single hidden layer and a nonlinear activation function can approximate any continuous function.\u00a0 Practical issues aside, such that the number of neurons in this hidden layer would grow enormously large, we do not need other network architectures. A simple [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,69,1898,83,88,70,1780],"tags":[2193,132,118],"class_list":["post-3480","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-artificial-intelligence","category-convolutional-network","category-data-science","category-deep-learning","category-machine-learning","category-neural-network","tag-architectures","tag-network","tag-neural"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3480"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=3480"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3480\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=3480"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=3480"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=3480"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}