The performance of an algorithm depends on both the computer architecture and the software. An Intel Xeon processor based HPC cluster and an Intel Itanium2 based symmetric multiprocessing (SMP) architecture are used for the performance analysis of a PDE-based parallel algorithm. The algorithm is parallelized using MPI, and performance measurements are made with Tuning and Analysis Utilities (TAU). Computational optimization exposes data independence and helps the compiler generate a more efficient program for the specific processor. Removing data dependencies inside loops is the key to this work, and the source code is changed for that purpose. In iterative algorithms such as the Gauss-Seidel method, each processor communicates with the same processors at every iteration, which makes a persistent connection preferable. MPI provides different communication methods for different communication characteristics, and persistent connection is one of them. A persistent connection removes connection overhead by leaving the connection open for further communication, so it is preferable when data is transferred repeatedly between the same nodes. In this work, an MPI persistent connection is used for the communication at each iteration of the Gauss-Seidel method. In some parallel algorithms, communication must be synchronized. When all processors communicate at the same time, communication becomes a bottleneck if the communication medium is shared. This effect has been studied and analyzed.
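To illustrate what removing a data dependency inside a loop can look like for Gauss-Seidel, the following minimal sketch compares a classic sweep with a red-black ordering on a 1D model problem. The abstract does not state which transformation was applied, so red-black ordering is an assumption chosen for illustration, and the function names, grid size, and test problem (-u'' = 2 on [0, 1] with zero boundary values) are hypothetical.

```python
def gauss_seidel_sweep(u, f, h):
    """Classic sweep: u[i] reads u[i - 1], which was written in the
    previous loop iteration, so the loop carries a dependency."""
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def red_black_sweep(u, f, h):
    """Red-black sweep (illustrative assumption, not the paper's stated method).
    Within one colour, each point reads only points of the other colour,
    so the inner loop has no loop-carried dependency."""
    n = len(u)
    for start in (1, 2):  # odd interior points first, then even ones
        for i in range(start, n - 1, 2):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def solve(sweep, n=17, iterations=2000):
    """Iterate the given sweep on -u'' = 2 with u(0) = u(1) = 0.
    The exact solution of this model problem is u(x) = x * (1 - x)."""
    h = 1.0 / (n - 1)
    u = [0.0] * n   # boundary entries stay at zero
    f = [2.0] * n   # constant right-hand side
    for _ in range(iterations):
        sweep(u, f, h)
    return u, h
```

Both orderings converge to the same discrete solution. The difference is that the inner loop of the red-black version can be vectorized or split across processors without changing the result, which is the kind of independence a compiler can exploit.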