Taming of the Shrew: Modeling the Normal and Faulty Behaviour of Large-scale HPC Systems | IEEE Conference Publication | IEEE Xplore