Run-time thread sorting to expose data-level parallelism | IEEE Conference Publication | IEEE Xplore