Skip to Main Content
In this paper we describe our design, implementation, and initial results of a prototype connected-phoneme—based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.) processor. Automated speech recognition decodes speech samples into plaintext (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Architecture. Identifying and exploiting these parallelism opportunities is challenging and critical to improving system performance. From our initial performance timings, we observed that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time—a channel density that is orders of magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E. processor-based speech recognition and will likely lead to the development of production speech systems using Cell/B.E. processor clusters.
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.