Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference | IEEE Conference Publication | IEEE Xplore