Sitemap

A list of all the posts and pages found on the site. For robots, an XML version is available for digesting as well.

Pages

Posts

Fekete Points

1 minute read

Published:

Interpolation is an important problem, and doing it well requires a good choice of interpolation points.
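As a hedged illustration of the teaser (not taken from the post itself), the sketch below shows why the choice of interpolation points matters: equispaced points suffer from the Runge phenomenon, while points clustered near the endpoints, as Fekete points are, behave far better. Chebyshev-Lobatto points are used here as a cheap stand-in for true Fekete points; all names in the sketch are my own.

```python
import math

def lagrange_eval(xs, ys, x):
    # Evaluate the Lagrange interpolating polynomial through (xs, ys) at x.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def runge(x):
    # The classic Runge test function on [-1, 1].
    return 1.0 / (1.0 + 25.0 * x * x)

n = 12
equispaced = [-1.0 + 2.0 * i / n for i in range(n + 1)]
# Chebyshev-Lobatto points cluster near the endpoints, much as Fekete points do.
chebyshev = [math.cos(math.pi * i / n) for i in range(n + 1)]

grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
errs = {}
for name, xs in [("equispaced", equispaced), ("chebyshev", chebyshev)]:
    ys = [runge(x) for x in xs]
    errs[name] = max(abs(lagrange_eval(xs, ys, x) - runge(x)) for x in grid)
    print(name, errs[name])
```

The equispaced error grows with the degree while the clustered-point error shrinks, which is the basic reason good interpolation points are worth computing.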

portfolio

publications

Communication bounds for convolutional neural networks

Published in PASC '22: Proceedings of the Platform for Advanced Scientific Computing Conference, 2022

Convolutional neural networks (CNNs) are important in a wide variety of machine learning tasks and applications, so optimizing their performance is essential.

Recommended citation: Anthony Chen, James Demmel, Grace Dinh, Mason Haberle, and Olga Holtz. 2022. Communication bounds for convolutional neural networks. In Proceedings of the Platform for Advanced Scientific Computing Conference (PASC '22). Association for Computing Machinery, New York, NY, USA, Article 1, 1–10. https://dl.acm.org/doi/10.1145/3539781.3539784

talks

Communication Bounds for Convolutional Neural Networks

Published:

Convolutional neural networks (CNNs) are important in a wide variety of machine learning tasks and applications, so optimizing their performance is essential. Moving words of data between levels of a memory hierarchy or between processors on a network is much more expensive than the cost of arithmetic, so minimizing communication is critical to optimizing performance. In this paper, we present new precise lower bounds on data movement for convolutions in both single-processor and parallel distributed memory models, as well as algorithms that outperform current implementations such as Im2Col. We obtain performance figures using GEMMINI, a machine learning accelerator, where our tiling provides improvements between 13% and 150% over a vendor-supplied algorithm.
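For readers unfamiliar with Im2Col, the abstract's point of comparison, here is a minimal pure-Python sketch (my own illustration, not the paper's implementation) of how im2col lowers a 2D convolution to a matrix-style product by flattening each input patch into a row:

```python
def im2col(image, kh, kw):
    # image: 2D list (H x W). Returns one flattened kh x kw patch per
    # output pixel ("valid" convolution, stride 1).
    H, W = len(image), len(image[0])
    cols = []
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            patch = [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            cols.append(patch)
    return cols

def conv2d_via_im2col(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    flat_k = [kernel[di][dj] for di in range(kh) for dj in range(kw)]
    cols = im2col(image, kh, kw)
    # Each output value is a dot product of a patch row with the kernel.
    flat = [sum(a * b for a, b in zip(row, flat_k)) for row in cols]
    out_w = len(image[0]) - kw + 1
    return [flat[r * out_w:(r + 1) * out_w] for r in range(len(flat) // out_w)]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
out = conv2d_via_im2col(image, kernel)
print(out)  # [[6, 8], [12, 14]]
```

Note that im2col replicates overlapping input entries into multiple rows; that data duplication is extra memory traffic, which is roughly the kind of cost that communication lower bounds and better tilings aim to reduce.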

Fast Summation on a Sphere

Published:

Fast summation techniques have proven to be of great importance in a variety of fields. In this talk, I will present a new technique for performing fast summations on spheres, suitable for sums arising from spherical convolutions, and I will present applications of this technique to problems from atmospheric and oceanic modeling.
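As context for the abstract (my own illustration, not the talk's method), the baseline that fast summation techniques accelerate is the direct O(N²) evaluation of a convolution-type sum over points on the sphere. A minimal sketch, assuming a zonal kernel that depends only on the dot product x·y:

```python
import math

def direct_spherical_sum(points, weights, kernel):
    # Naive O(N^2) evaluation of f(x_i) = sum_j w_j * k(x_i . x_j),
    # the kind of sum a fast summation method speeds up.
    out = []
    for xi in points:
        total = 0.0
        for xj, wj in zip(points, weights):
            dot = sum(a * b for a, b in zip(xi, xj))
            total += wj * kernel(dot)
        out.append(total)
    return out

# A few points on the unit sphere and a smooth zonal kernel.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
weights = [1.0, 1.0, 1.0]
vals = direct_spherical_sum(points, weights, lambda t: math.exp(t))
print(vals)
```

The cost of this double loop grows quadratically with the number of points, which is what makes faster algorithms attractive for high-resolution atmospheric and oceanic grids.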

teaching