A Non-blocking Programming Framework for Pipeline Application on Multi-core Platform

7 Author(s)
Xiaoqiang Li ; Sch. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China ; Hong An ; Gu Liu ; Wenting Han

Many applications follow common programming patterns such as pipeline, fork-join, and do-all. Tools such as OS threads and OpenMP let programmers express only task or data parallelism, and dedicated support for programming patterns is distinctly lacking. Intel Threading Building Blocks (TBB) was developed to address this problem, but its scheduler is general-purpose and not optimized for any particular one of its parallel algorithms, pipeline included. In this paper, we present a non-blocking framework for pipeline applications on multi-core platforms. We target linear pipelines, in which each filter has one entrance and one exit. We design a novel work-stealing scheduler optimized specifically for pipeline applications. First, stealing is priority based: a priority is calculated for each filter in the pipeline so that a worker can easily find the optimal "victim" when it needs to steal. Second, multiple tasks can be stolen at a time, which greatly reduces stealing time. A non-blocking queue stores intermediate results to reduce lock overhead and increase scalability. We apply our framework to four case studies: text filter, twofish, ferret, and dedup. Our framework reduces execution time relative to TBB by 72% in the best case and 20% on average on an 8-core machine.
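The two scheduler ideas in the abstract, priority-based victim selection and stealing several tasks at once, can be illustrated with a minimal single-threaded sketch. This is not the paper's implementation. The abstract does not give the priority formula, so the one below (backlog size, with the later stage winning ties) is an assumption, as are the names `FilterQueue`, `priority`, `pick_victim`, `steal_batch`, and the `max_batch` parameter. A plain `deque` stands in for the non-blocking queue and no concurrency is modelled.

```python
from collections import deque


class FilterQueue:
    """Pending items waiting at one filter (stage) of a linear pipeline."""

    def __init__(self, stage_index):
        self.stage_index = stage_index
        self.items = deque()


def priority(queue):
    # ASSUMPTION: the real priority formula is not stated in the abstract.
    # Here a larger backlog wins, and the later stage wins ties.
    return (len(queue.items), queue.stage_index)


def pick_victim(queues):
    """Return the non-empty queue with the highest priority, or None."""
    candidates = [q for q in queues if q.items]
    if not candidates:
        return None
    return max(candidates, key=priority)


def steal_batch(victim, max_batch=4):
    """Remove several items in one steal: half the backlog, capped at max_batch."""
    count = min(max_batch, max(1, len(victim.items) // 2))
    return [victim.items.popleft() for _ in range(count)]


# Example: three filters, with most of the backlog at the first one.
queues = [FilterQueue(i) for i in range(3)]
queues[0].items.extend(range(10))
queues[1].items.extend(range(3))

victim = pick_victim(queues)
stolen = steal_batch(victim)
```

In this example an idle worker picks the first filter, which has the largest backlog, and takes four items in a single steal instead of returning to steal four times.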

Published in:

Parallel and Distributed Processing with Applications (ISPA), 2011 IEEE 9th International Symposium on

Date of Conference:

26-28 May 2011