{"id":3417,"date":"2025-04-29T07:02:21","date_gmt":"2025-04-29T07:02:21","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/04\/29\/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of\/"},"modified":"2025-04-29T07:02:21","modified_gmt":"2025-04-29T07:02:21","slug":"numexpr-the-faster-than-numpy-library-that-no-ones-heard-of","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/04\/29\/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of\/","title":{"rendered":"NumExpr: The \u201cFaster than Numpy\u201d Library Most Data Scientists Have Never Used"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\">Browsing GitHub the other day, I came across a library I\u2019d never heard of before. It was called <strong>NumExpr<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">I was immediately interested because of some claims made about the library. In particular, it stated that for some complex numerical calculations, it was up to 15 times faster than NumPy.<\/p>\n<p class=\"wp-block-paragraph\">I was intrigued because, up until now, NumPy has been largely unchallenged as the dominant numerical computing library in Python. In <a href=\"https:\/\/towardsdatascience.com\/tag\/data-science\/\" title=\"Data Science\">Data Science<\/a> in particular, NumPy is a cornerstone of machine learning, exploratory data analysis and model training. Anything that helps us squeeze more performance out of our systems is welcome. 
So, I decided to put the claims to the test myself.<\/p>\n<p class=\"wp-block-paragraph\">You can find a link to the NumExpr repository at the end of this article.<\/p>\n<h2 class=\"wp-block-heading\">What is\u00a0NumExpr?<\/h2>\n<p class=\"wp-block-paragraph\">According to its GitHub page, NumExpr is a fast numerical expression evaluator for <a href=\"https:\/\/towardsdatascience.com\/tag\/numpy\/\" title=\"Numpy\">NumPy<\/a>. Using it, expressions that operate on arrays are accelerated and use less memory than performing the same calculations with NumPy alone. It does this by compiling the expression and evaluating it in cache-sized chunks, which avoids allocating the full-size temporary arrays that NumPy creates for each intermediate result.<\/p>\n<p class=\"wp-block-paragraph\">In addition, as it is multithreaded, NumExpr can use all your CPU cores, which generally results in a substantial speed-up over NumPy on multi-core machines.<\/p>\n<h2 class=\"wp-block-heading\">Setting up a development environment<\/h2>\n<p class=\"wp-block-paragraph\">Before we start coding, let\u2019s set up our development environment. The best practice is to create a separate <a href=\"https:\/\/towardsdatascience.com\/tag\/python\/\" title=\"Python\">Python<\/a> environment where you can install any necessary software and experiment with coding, knowing that anything you do in this environment won\u2019t affect the rest of your system. I use conda for this, but you can use whichever method suits you best.<\/p>\n<p class=\"wp-block-paragraph\" id=\"a48d\">If you want to go down the Miniconda route and don\u2019t already have it, you must install Miniconda first. 
Get it using this link:<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.anaconda.com\/docs\/main\">https:\/\/www.anaconda.com\/docs\/main<\/a><\/p>\n<p class=\"wp-block-paragraph\"><strong>1\/ Create our new dev environment and install the required libraries<\/strong><\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">(base) $ conda create -n numexpr_test python=3.12 -y\n(base) $ conda activate numexpr_test\n(numexpr_test) $ pip install numexpr\n(numexpr_test) $ pip install jupyter\n(numexpr_test) $ pip install scipy pillow matplotlib<\/code><\/pre>\n<p class=\"wp-block-paragraph\"><strong>2\/ Start Jupyter<\/strong><br \/>Now type in\u00a0<code>jupyter notebook\u00a0<\/code>into your command prompt. You should see a Jupyter notebook open in your browser. If that doesn\u2019t happen automatically, you\u2019ll likely see a screenful of information after the\u00a0<code>jupyter notebook\u00a0<\/code>command. Near the bottom, you will find a URL that you should copy and paste into your browser to launch the Jupyter Notebook.<\/p>\n<p class=\"wp-block-paragraph\">Your URL will be different to mine, but it should look something like this:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">http:\/\/127.0.0.1:8888\/tree?token=3b9f7bd07b6966b41b68e2350721b2d0b6f388d248cc69<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Comparing NumExpr and NumPy performance<\/h2>\n<p class=\"wp-block-paragraph\">To compare the performance, we\u2019ll run a series of numerical computations using NumPy and NumExpr, and time both systems.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Example 1\u200a\u2014\u200aA simple array addition calculation<\/strong><br \/>In this example, we run a vectorised weighted addition of two large arrays (one million elements each) 5000 times.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nimport numexpr as ne\nimport timeit\n\na = np.random.rand(1000000)\nb = 
np.random.rand(1000000)\n\n# Using timeit with lambda functions\ntime_np_expr = timeit.timeit(lambda: 2*a + 3*b, number=5000)\ntime_ne_expr = timeit.timeit(lambda: ne.evaluate(\"2*a + 3*b\"), number=5000)\n\nprint(f\"Execution time (NumPy): {time_np_expr} seconds\")\nprint(f\"Execution time (NumExpr): {time_ne_expr} seconds\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\n\nExecution time (NumPy): 12.03680682599952 seconds\nExecution time (NumExpr): 1.8075962659931974 seconds<\/code><\/pre>\n<p class=\"wp-block-paragraph\">I have to say, that\u2019s a pretty impressive start from the NumExpr library already. I make that roughly a 6.7 times improvement over the NumPy runtime.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s double-check that both operations return the same result set.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">\n# Arrays to store the results\nresult_np = 2*a + 3*b\nresult_ne = ne.evaluate(\"2*a + 3*b\")\n\n# Ensure the two new arrays are equal\narrays_equal = np.array_equal(result_np, result_ne)\nprint(f\"Arrays equal: {arrays_equal}\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\nArrays equal: True<\/code><\/pre>\n<p class=\"wp-block-paragraph\"><strong>Example 2\u200a\u2014\u200aCalculate Pi using a Monte Carlo simulation<\/strong><\/p>\n<p class=\"wp-block-paragraph\">Our second example will examine a more complicated use case with more real-world applications.<\/p>\n<p class=\"wp-block-paragraph\">Monte Carlo simulations involve running many iterations of a random process to estimate a system\u2019s properties, which can be computationally intensive.<\/p>\n<p class=\"wp-block-paragraph\">In this case, we\u2019ll use Monte Carlo to calculate the value of Pi. This is a well-known example where we take a square with a side length of one unit and inscribe a quarter circle inside it with a radius of one unit. 
The ratio of the quarter circle\u2019s area to the square\u2019s area is <em>(\u03c0\/4)\/1<\/em>, and we can multiply this ratio by four to get <strong><em>\u03c0<\/em><\/strong> on its own.<\/p>\n<p class=\"wp-block-paragraph\">So, if we consider numerous random (x,y) points that all lie within or on the bounds of the square, then as the total number of these points tends to infinity, the ratio of points that lie on or inside the quarter circle to the total number of points tends towards \u03c0\/4. Multiplying that ratio by four gives our estimate of Pi.<\/p>\n<p class=\"wp-block-paragraph\">First, the NumPy implementation.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nimport timeit\n\ndef monte_carlo_pi_numpy(num_samples):\n    x = np.random.rand(num_samples)\n    y = np.random.rand(num_samples)\n    inside_circle = (x**2 + y**2) &lt;= 1.0\n    pi_estimate = (np.sum(inside_circle) \/ num_samples) * 4\n    return pi_estimate\n\n# Benchmark the NumPy version\nnum_samples = 1000000\ntime_np_expr = timeit.timeit(lambda: monte_carlo_pi_numpy(num_samples), number=1000)\npi_estimate = monte_carlo_pi_numpy(num_samples)\n\nprint(f\"Estimated Pi (NumPy): {pi_estimate}\")\nprint(f\"Execution Time (NumPy): {time_np_expr} seconds\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\nEstimated Pi (NumPy): 3.144832\nExecution Time (NumPy): 10.642843848007033 seconds<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Now, using NumExpr.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nimport numexpr as ne\nimport timeit\n\ndef monte_carlo_pi_numexpr(num_samples):\n    x = np.random.rand(num_samples)\n    y = np.random.rand(num_samples)\n    inside_circle = ne.evaluate(\"(x**2 + y**2) &lt;= 1.0\")\n    pi_estimate = (np.sum(inside_circle) \/ num_samples) * 4  # Use NumPy for summation\n    return pi_estimate\n\n# Benchmark the NumExpr version\nnum_samples = 1000000\ntime_ne_expr = timeit.timeit(lambda: monte_carlo_pi_numexpr(num_samples), 
number=1000)\npi_estimate = monte_carlo_pi_numexpr(num_samples)\n\nprint(f\"Estimated Pi (NumExpr): {pi_estimate}\")\nprint(f\"Execution Time (NumExpr): {time_ne_expr} seconds\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\nEstimated Pi (NumExpr): 3.141684\nExecution Time (NumExpr): 8.077501275009126 seconds<\/code><\/pre>\n<p class=\"wp-block-paragraph\">OK, so the speed-up was not as impressive that time, but a reduction in run time of roughly 24% isn\u2019t terrible either. Part of the reason is that only the comparison expression runs through NumExpr; generating the random samples and summing the result are still done with NumPy.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Example 3\u200a\u2014\u200aImplementing a Sobel image filter<\/strong><\/p>\n<p class=\"wp-block-paragraph\">In this example, we\u2019ll implement a Sobel filter for images. The Sobel filter is commonly used in image processing for edge detection. It calculates the image intensity gradient at each pixel, highlighting edges and intensity transitions. 
Our input image is of the Taj Mahal in India.<\/p>\n<figure class=\"wp-block-image alignwide size-large\"><img data-recalc-dims=\"1\" height=\"678\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/image-123-1024x678.png?resize=1024%2C678&#038;ssl=1\" alt=\"\" class=\"wp-image-602239\"><figcaption class=\"wp-element-caption\"><strong>Original image by Yury Taranik (licensed from Shutterstock)<\/strong><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Let\u2019s see the NumPy code running first and time it.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nfrom scipy.ndimage import convolve\nfrom PIL import Image\nimport timeit\n\n# Sobel kernels\nsobel_x = np.array([[-1, 0, 1],\n                    [-2, 0, 2],\n                    [-1, 0, 1]])\n\nsobel_y = np.array([[-1, -2, -1],\n                    [ 0,  0,  0],\n                    [ 1,  2,  1]])\n\ndef sobel_filter_numpy(image):\n    \"\"\"Apply Sobel filter using NumPy.\"\"\"\n    img_array = np.array(image.convert('L'))  # Convert to grayscale\n    gradient_x = convolve(img_array, sobel_x)\n    gradient_y = convolve(img_array, sobel_y)\n    gradient_magnitude = np.sqrt(gradient_x**2 + gradient_y**2)\n    gradient_magnitude *= 255.0 \/ gradient_magnitude.max()  # Normalize to 0-255\n    \n    return Image.fromarray(gradient_magnitude.astype(np.uint8))\n\n# Load an example image\nimage = Image.open(\"\/mnt\/d\/test\/taj_mahal.png\")\n\n# Benchmark the NumPy version\ntime_np_sobel = timeit.timeit(lambda: sobel_filter_numpy(image), number=100)\nsobel_image_np = sobel_filter_numpy(image)\nsobel_image_np.save(\"\/mnt\/d\/test\/sobel_taj_mahal_numpy.png\")\n\nprint(f\"Execution Time (NumPy): {time_np_sobel} seconds\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\nExecution Time (NumPy): 8.093792188999942 seconds<\/code><\/pre>\n<p class=\"wp-block-paragraph\">And now the NumExpr 
code.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nimport numexpr as ne\nfrom scipy.ndimage import convolve\nfrom PIL import Image\nimport timeit\n\n# Sobel kernels\nsobel_x = np.array([[-1, 0, 1],\n                    [-2, 0, 2],\n                    [-1, 0, 1]])\n\nsobel_y = np.array([[-1, -2, -1],\n                    [ 0,  0,  0],\n                    [ 1,  2,  1]])\n\ndef sobel_filter_numexpr(image):\n    \"\"\"Apply Sobel filter using NumExpr for gradient magnitude computation.\"\"\"\n    img_array = np.array(image.convert('L'))  # Convert to grayscale\n    gradient_x = convolve(img_array, sobel_x)\n    gradient_y = convolve(img_array, sobel_y)\n    gradient_magnitude = ne.evaluate(\"sqrt(gradient_x**2 + gradient_y**2)\")\n    gradient_magnitude *= 255.0 \/ gradient_magnitude.max()  # Normalize to 0-255\n    \n    return Image.fromarray(gradient_magnitude.astype(np.uint8))\n\n# Load an example image\nimage = Image.open(\"\/mnt\/d\/test\/taj_mahal.png\")\n\n# Benchmark the NumExpr version\ntime_ne_sobel = timeit.timeit(lambda: sobel_filter_numexpr(image), number=100)\nsobel_image_ne = sobel_filter_numexpr(image)\nsobel_image_ne.save(\"\/mnt\/d\/test\/sobel_taj_mahal_numexpr.png\")\n\nprint(f\"Execution Time (NumExpr): {time_ne_sobel} seconds\")\n\n&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;\n\nExecution Time (NumExpr): 4.938702256011311 seconds<\/code><\/pre>\n<p class=\"wp-block-paragraph\">On this occasion, NumExpr cut the run time from about 8.1 seconds to about 4.9 seconds, making it roughly 1.6 times faster than NumPy.<\/p>\n<p class=\"wp-block-paragraph\">Here is what the edge-detected image looks like.<\/p>\n<figure class=\"wp-block-image alignwide size-large\"><img data-recalc-dims=\"1\" height=\"678\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/image-124-1024x678.png?resize=1024%2C678&#038;ssl=1\" alt=\"\" 
class=\"wp-image-602240\"><figcaption class=\"wp-element-caption\"><strong>Image by Author<\/strong><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\"><strong>Example 4\u200a\u2014\u200a Fourier series approximation<\/strong><\/p>\n<p class=\"wp-block-paragraph\">It\u2019s well known that complex periodic functions can be simulated by applying a series of sine waves superimposed on each other. At the extreme, even a square wave can be easily modelled in this way. The method is called the Fourier series approximation. Although an approximation, we can get as close to the target wave shape as memory and computational capacity allow.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The maths behind all this isn\u2019t the primary focus. Just be aware that when we increase the number of iterations, the run-time of the solution rises markedly.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import numpy as np\nimport numexpr as ne\nimport time\nimport matplotlib.pyplot as plt\n\n# Define the constant pi explicitly\npi = np.pi\n\n# Generate a time vector and a square wave signal\nt = np.linspace(0, 1, 1000000) # Reduced size for better visualization\nsignal = np.sign(np.sin(2 * np.pi * 5 * t))\n\n# Number of terms in the Fourier series\nn_terms = 10000\n\n# Fourier series approximation using NumPy\nstart_time = time.time()\napprox_np = np.zeros_like(t)\nfor n in range(1, n_terms + 1, 2):\n    approx_np += (4 \/ (np.pi * n)) * np.sin(2 * np.pi * n * 5 * t)\nnumpy_time = time.time() - start_time\n\n# Fourier series approximation using NumExpr\nstart_time = time.time()\napprox_ne = np.zeros_like(t)\nfor n in range(1, n_terms + 1, 2):\n    approx_ne = ne.evaluate(\"approx_ne + (4 \/ (pi * n)) * sin(2 * pi * n * 5 * t)\", local_dict={\"pi\": pi, \"n\": n, \"approx_ne\": approx_ne, \"t\": t})\nnumexpr_time = time.time() - start_time\n\nprint(f\"NumPy Fourier series time: {numpy_time:.6f} seconds\")\nprint(f\"NumExpr Fourier series time: 
{numexpr_time:.6f} seconds\")\n\n# Plotting the results\nplt.figure(figsize=(10, 6))\n\nplt.plot(t, signal, label='Original Signal (Square Wave)', color='black', linestyle='--')\nplt.plot(t, approx_np, label='Fourier Approximation (NumPy)', color='blue')\nplt.plot(t, approx_ne, label='Fourier Approximation (NumExpr)', color='red', linestyle='dotted')\n\nplt.title('Fourier Series Approximation of a Square Wave')\nplt.xlabel('Time')\nplt.ylabel('Amplitude')\nplt.legend()\nplt.grid(True)\nplt.show()<\/code><\/pre>\n<p class=\"wp-block-paragraph\">And the output?<\/p>\n<figure class=\"wp-block-image alignwide size-large\"><img data-recalc-dims=\"1\" height=\"692\" width=\"1024\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/04\/image-125-1024x692.png?resize=1024%2C692&#038;ssl=1\" alt=\"\" class=\"wp-image-602241\"><figcaption class=\"wp-element-caption\"><strong>Image by Author<\/strong><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">That is another pretty good result. NumExpr shows a 5 times improvement over NumPy on this occasion.<\/p>\n<h3 class=\"wp-block-heading\">Summary<\/h3>\n<p class=\"wp-block-paragraph\">NumPy and NumExpr are both powerful libraries for numerical computation in Python. They each have their own strengths and use cases, making them suitable for different types of tasks. Here, we compared their performance on tasks ranging from simple array addition to more complex applications, such as using a Sobel filter for image edge detection.<\/p>\n<p class=\"wp-block-paragraph\">While I didn\u2019t quite see the claimed 15x speed increase over NumPy in my tests, there\u2019s no doubt that NumExpr can be significantly faster than NumPy in many cases.<\/p>\n<p class=\"wp-block-paragraph\">If you\u2019re a heavy user of NumPy and need to extract every bit of performance from your code, I recommend trying the NumExpr library. 
Apart from the fact that not all NumPy code can be replicated using NumExpr, there\u2019s practically no downside, and the upside might surprise you.<\/p>\n<p class=\"wp-block-paragraph\">For more details on the NumExpr library, check out the GitHub page <a href=\"https:\/\/github.com\/pydata\/numexpr\">here<\/a>.<\/p>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of\/\">NumExpr: The \u201cFaster than Numpy\u201d Library Most Data Scientists Have Never Used<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p>Thomas Reid<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>NumExpr: The \u201cFaster than Numpy\u201d Library Most Data Scientists Have Never Used Browsing GitHub the other day, I came across a library I\u2019d never heard of before. It was called NumExpr. I was immediately interested because of some claims made about the library. 
In particular, it stated that for some complex numerical calculations, it was [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,401,83,913,160,157,280],"tags":[2500,2501,291],"class_list":["post-3417","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-data-engineering","category-data-science","category-numpy","category-programming","category-python","category-technology","tag-numexpr","tag-numpy","tag-use"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3417"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=3417"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3417\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=3417"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=3417"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=3417"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}