Principle-based Dataflow Optimization for Communication Lower Bound in Operator-Fused Tensor Accelerator | IEEE Conference Publication | IEEE Xplore