Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines | IEEE Conference Publication | IEEE Xplore