Optimizing Multi-Level Checkpointing for Distributed Deep Learning Workloads on Cloud Spot VM Clusters | IEEE Journals & Magazine | IEEE Xplore