ECCheck: Enhancing In-Memory Checkpoint with Erasure Coding in Distributed DNN Training | IEEE Conference Publication | IEEE Xplore