The objective of this paper is to present the current evidence relative to the effectiveness of pair programming (PP) as a pedagogical tool in higher education CS/SE courses. We performed a systematic literature review (SLR) of empirical studies that investigated factors affecting the effectiveness of PP for CS/SE students and studies that measured the effectiveness of PP for CS/SE students. Seventy-four papers were used in our synthesis of evidence, and 14 compatibility factors that can potentially affect PP's effectiveness as a pedagogical tool were identified. Results showed that students' skill level was the factor that affected PP's effectiveness the most. The most common measure used to gauge PP's effectiveness was time spent on programming. In addition, students' satisfaction when using PP was overall higher than when working solo. Our meta-analyses showed that PP was effective in improving students' grades on assignments. Finally, in the studies that used quality as a measure of effectiveness, the number of test cases succeeded, academic performance, and expert opinion were the quality measures mostly applied. The results of this SLR show two clear gaps in this research field: 1) a lack of studies focusing on pair compatibility factors aimed at making PP an effective pedagogical tool and 2) a lack of studies investigating PP for software design/modeling tasks in conjunction with programming tasks.