Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | IEEE Conference Publication | IEEE Xplore