Abstract:
Because of its important role in many numerical methods, the solution of sparse triangular linear systems (SpTRSV) on parallel platforms is continuously studied to extract as much performance as possible from the latest hardware architectures. On GPUs, the latest solvers use the synchronization-free paradigm. When the problem involves several system solutions for the same matrix, they often pre-process the matrix through a level-set analysis to improve the scheduling of the equations during the solution phase. In addition, other optimizations address the load-balancing issues and irregular memory accesses of SpTRSV. In this work, we modify the classical approach to computing the level sets used in the parallel SpTRSV computation, and we show that the new strategy generally reduces the computation time of the solver. Furthermore, we design an internal matrix representation that can significantly accelerate the solution stage at the cost of increasing the memory storage requirements of the algorithm. The experimental evaluation shows that the proposed modifications can improve the performance of a recent level-set and synchronization-free solver by up to 70%, significantly outperforming other state-of-the-art solvers, especially when several linear systems must be solved for each analysis phase.
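For readers unfamiliar with level-set analysis, the following is a minimal sequential sketch of the classical approach, not the modified strategy or the internal matrix representation proposed in this work. It assumes a lower-triangular matrix stored in CSR format with an explicit nonzero diagonal; the function names `level_sets` and `sptrsv_by_levels` and the array names `row_ptr`, `col_idx`, and `values` are illustrative choices and do not come from the paper.

```python
def level_sets(row_ptr, col_idx):
    """Group the rows of a lower-triangular CSR matrix into level sets.

    A row's level is one more than the highest level among the rows it
    depends on (its strictly-lower nonzero columns); rows with no
    dependencies are at level 0.
    """
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # x[i] depends on x[j]
                level[i] = max(level[i], level[j] + 1)
    n_levels = max(level, default=-1) + 1
    sets = [[] for _ in range(n_levels)]
    for i, lv in enumerate(level):
        sets[lv].append(i)
    return sets


def sptrsv_by_levels(row_ptr, col_idx, values, b, sets):
    """Solve L x = b by processing the level sets in order.

    Rows inside one level are mutually independent, so a parallel
    solver could process them concurrently; here they run sequentially.
    """
    x = [0.0] * (len(row_ptr) - 1)
    for rows in sets:
        for i in rows:
            acc = b[i]
            diag = None
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j == i:
                    diag = values[k]
                elif j < i:
                    acc -= values[k] * x[j]
            x[i] = acc / diag
    return x
```

Because the level sets depend only on the sparsity pattern, they can be computed once in an analysis phase and reused for every right-hand side solved with the same matrix, which is the scenario the abstract highlights.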
Published in: 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
Date of Conference: 13-15 November 2024
Date Added to IEEE Xplore: 27 November 2024