A scalable runtime fault detection mechanism for high performance computing | IEEE Conference Publication | IEEE Xplore