Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems