Tag: performance

Breaking the Host Memory Bottleneck: How Peer Direct Transformed Gaudi’s Cloud Performance

Engineering RDMA-like performance over cloud host NICs using libfabric, DMA-BUF, and HCCL to restore distributed training scalability. By Maria Piterberg, Towards Data Science.

February 26, 2026
Pydantic Performance: 4 Tips on How to Validate Large Amounts of Data Efficiently

The real value lies in writing clearer code and using your tools right. By Mike Huls, Towards Data Science.

February 7, 2026
Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization

Leveraging massive parallelism, asynchronous updates, and multi-machine training to match and exceed human-level performance. By Sam Black, Towards Data Science.

February 2, 2026
Achieving 5x Agentic Coding Performance with Few-Shot Prompting

Learn to leverage few-shot prompting to increase your LLM’s performance. By Eivind Kjosbakken, Towards Data Science.

January 24, 2026
How to Improve the Performance of Visual Anomaly Detection Models

Apply the best methods from academia to get the most out of practical applications. By Aimira Baitieva, Towards Data Science.

January 9, 2026
7 Pandas Performance Tricks Every Data Scientist Should Know

What I’ve learned about making Pandas faster after too many slow notebooks and frozen sessions. By Benjamin Nweke, Towards Data Science.

December 12, 2025
Overcoming the Hidden Performance Traps of Variable-Shaped Tensors: Efficient Data Sampling in PyTorch

PyTorch Model Performance Analysis and Optimization, Part 11. By Chaim Rand, Towards Data Science.

December 4, 2025
4 Techniques to Optimize Your LLM Prompts for Cost, Latency and Performance

Learn how to greatly improve the performance of your LLM application. By Eivind Kjosbakken, Towards Data Science.

October 30, 2025
A Honest Cross-Validation Estimator for Prediction Performance

arXiv:2510.07649v1. Abstract: Cross-validation is a standard tool for obtaining a honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model performance on the test set, and averages the…

October 10, 2025
The Crucial Role of NUMA Awareness in High-Performance Deep Learning

PyTorch model performance analysis and optimization, Part 10. By Chaim Rand, Towards Data Science.

July 10, 2025
Understanding Application Performance with Roofline Modeling

A common challenge with calculating an application’s performance is that the real-world performance and theoretical performance can differ. With an ecosystem of products that is growing with high performance needs such as High Performance Computing (HPC), gaming, or in the current landscape – Large Language Models (LLMs), it is…

June 21, 2025
Balancing Performance and Costs in Best Arm Identification

arXiv:2505.20583v1. Abstract: We consider the problem of identifying the best arm in a multi-armed bandit model. Despite a wealth of literature in the traditional fixed budget and fixed confidence regimes of the best arm identification problem, it still remains a mystery to most practitioners…

May 28, 2025
Performance of Rank-One Tensor Approximation on Incomplete Data

arXiv:2504.07818v1. Abstract: We are interested in the estimation of a rank-one tensor signal when only a portion $\varepsilon$ of its noisy observation is available. We show that the study of this problem can be reduced to that of a random matrix model whose spectral…

April 11, 2025
Confidence Intervals for Evaluation of Data Mining

arXiv:2502.07016v1. Abstract: In data mining, when binary prediction rules are used to predict a binary outcome, many performance measures are used in a vast array of literature for the purposes of evaluation and comparison. Some examples include classification accuracy, precision, recall, F measures, and Jaccard…

February 12, 2025
Efficient Metric Collection in PyTorch: Avoiding the Performance Pitfalls of TorchMetrics

Metric collection is an essential part of every machine learning project, enabling us to track model performance and monitor training progress. Ideally, metrics should be collected and computed without introducing any additional overhead to the training process. However, just like other components of the…

February 7, 2025
Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks

arXiv:2501.04234v1. Abstract: Modern artificial intelligence is supported by machine learning models (e.g., foundation models) that are pretrained on a massive data corpus and then adapted to solve a variety of downstream tasks. To summarize performance across multiple tasks, evaluation metrics are…

January 9, 2025