Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation