LDPC Decoding on the Intel SCC

4 Author(s)
Diavastos, A. (Dept. of Comput. Sci., Univ. of Cyprus, Nicosia, Cyprus); Petrides, P.; Falcao, G.; Trancoso, P.

Low-Density Parity-Check (LDPC) codes are powerful error-correcting codes used in communication standards such as DVB-S2 and WiMAX to transmit data over noisy channels with high error probability. LDPC decoding is computationally demanding and requires irregular memory accesses, which makes it a suitable candidate for parallelization. The recent introduction of the many-core Single-chip Cloud Computer (SCC) from Intel Labs has created new opportunities, and new challenges, for programmers who wish to conveniently exploit the high level of parallelism available in the architecture. In this paper we propose three implementations of LDPC decoding algorithms that explore the scaling opportunities of the Intel SCC: a distributed-memory, a shared-memory, and a multi-codeword implementation. The experimental results show that the distributed-memory model could not scale because of the large number of messages exchanged by the parallel kernels, while the shared-memory model scaled only to a limited extent because of the overhead added by the uncacheable shared memory. The multi-codeword implementation, in contrast, scales almost linearly, achieving a relative throughput of 28 on 32 cores.
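
To put the final figure in context, a relative throughput of 28 on 32 cores corresponds to a parallel efficiency of 28/32 = 87.5%. The short Python sketch below works through that arithmetic. It assumes that "relative throughput" means throughput normalized to a single-core run, which the abstract does not state explicitly, and the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(relative_throughput: float, cores: int) -> float:
    """Return the fraction of ideal linear scaling that was achieved.

    Assumption: relative_throughput is measured against a single-core
    baseline, so ideal linear scaling on `cores` cores equals `cores`.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return relative_throughput / cores


# Figures reported in the abstract for the multi-codeword implementation.
efficiency = parallel_efficiency(28, 32)
print(f"{efficiency:.1%}")  # prints 87.5%
```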

Published in: 2012 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Date of Conference: 15-17 Feb. 2012