Abstract:
Deep learning training compilers accelerate training and make it more resource-efficient. We present a deep learning compiler for training with three main features: a syncfree optimizer, compiler caching, and multi-threaded execution. We demonstrate speedups on common language and vision problems against native and XLA baselines implemented in PyTorch.
Published in: 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning (PRML)
Date of Conference: 04-06 August 2023
Date Added to IEEE Xplore: 14 December 2023