Efficient algorithms for block-cyclic array redistribution between processor sets | IEEE Journals & Magazine | IEEE Xplore