Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors | IEEE Conference Publication | IEEE Xplore