BirdMoE: Reducing Communication Costs for Mixture-of-Experts Training Using Load-Aware Bi-random Quantization | IEEE Conference Publication | IEEE Xplore