Loop scheduling with complete memory latency hiding on multi-core architecture | IEEE Conference Publication | IEEE Xplore