Understanding and optimizing the performance of distributed machine learning applications on Apache Spark | IEEE Conference Publication | IEEE Xplore