Implementation and optimization of a thermal Lattice Boltzmann algorithm on a multi-GPU cluster | IEEE Conference Publication | IEEE Xplore