This paper presents an initial discussion of the performance aspects of various techniques for programming the single instruction stream, multiple data streams (SIMD) extensions of modern general-purpose processors. The analysis is based on a set of simple examples exhibiting data parallelism, implemented in C/C++, which may be considered typical basic building blocks of more complex applications. Various performance characteristics are analyzed using low-level profiling based on hardware performance counter monitoring. The results are discussed in the context of the architectural and organizational decisions made by processor manufacturers when implementing the SIMD model of parallel processing in modern general-purpose processors.
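To illustrate the kind of simple data-parallel building block described above, the following is a minimal sketch in C of an element-wise vector addition, written once as a plain scalar loop and once with SIMD intrinsics. The paper's own examples are not reproduced here: the function names `add_scalar` and `add_simd` are hypothetical, and the choice of the x86 SSE extension (guarded by the `__SSE__` macro that GCC and Clang predefine) is an assumption made for illustration only.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Scalar baseline: one single-precision addition per loop iteration.
 * (Hypothetical name; not taken from the paper.) */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* SIMD variant: when SSE is available, each iteration of the first loop
 * adds four floats with one instruction. Unaligned loads and stores are
 * used so that the caller's buffers need no special alignment.
 * (Hypothetical name; the use of SSE is an assumption.) */
void add_simd(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail: handles the remaining n % 4 elements, or the whole
     * array when SSE is not available at compile time. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results, so timing them on identical inputs, or comparing hardware performance counter readings for the two loops, gives one simple way to observe the effect of the SIMD extension. An optimizing compiler may also vectorize the scalar loop automatically, which should be kept in mind when interpreting such measurements.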