Over the past 20 to 30 years, the analysis of tandem mass spectrometry data generated from protein fragments has become the dominant method for the identification and classification of unknown protein samples. With wide-ranging applications in numerous scientific disciplines such as pharmaceutical research, cancer diagnostics, and bacterial identification, the need for accurate protein identification remains important, and the ability to produce more accurate identifications at faster rates would be of great benefit to society as a whole. As a key step towards improving the speed, and thus the achievable accuracy, of protein identification algorithms, this paper presents an FPGA-based solution that considerably accelerates the Isotope Pattern Calculator (IPC), a computationally intensive subroutine common in de novo protein identification. Although previous work shows incremental progress in the acceleration of software-based IPC (mainly by sacrificing accuracy for speed), to the best of our knowledge this is the first work to consider IPC on FPGAs. In this paper, we describe the design and implementation of an efficient and configurable IPC kernel. The described design provides 23 customization parameters, allowing for general use within many protein identification algorithms. We discuss several parameter tradeoffs and demonstrate experimentally their effect on performance. When comparing execution of optimized IPC software with various configurations of our hardware IPC solution, we demonstrate speedups between 72× and 566× on a single Stratix IV E530 FPGA. Finally, a favorable IPC configuration is scaled to multiple FPGAs, where a best-case speedup of 3340× on 16 FPGAs is observed when experimentally evaluated on a single node of Novo-G, the reconfigurable supercomputer in the NSF CHREC Center at Florida.
Date of Conference: 10-11 July 2012