A fast, efficient parallel-acting method of generating functions defined by power series, including logarithm, exponential, and sine, cosine

2 Author(s)
Mandelbaum, D.M. ; 168 Hollingston Pl., East Windsor, NJ, USA ; Mandelbaum, S.G.

A fundamental parallel procedure for implementing certain algorithms is by means of trees and arrays. A method of generating any function defined by a power series in a fast, efficient parallel-acting manner using trees and arrays is described. The power series considered can be written as $f(Y) = a_0 + a_1 Y + a_2 Y^2 + \cdots$, where $Y = v_1 x + v_2 x^2 + \cdots + v_k x^k$, with $v_i \in \{0, 1\}$, is a binary fraction when $x = 1/2$. The power series must be expanded into individual terms $c x^i$. These terms are then transformed into weighted binary terms. Two methods are given to obtain all the individual terms (including coefficients) associated with each power of $x$. The hardware required for implementation is a tree similar to a Wallace or Dadda tree used for parallel multiplication of two binary numbers. Despite the multiplicity of terms required, Boolean logic methods reduce the tree dimensions in many cases so that the total tree required is smaller than an existing multiplier tree. In that case, Schwarz and Flynn (1993) have shown that the required tree can be superimposed on the existing multiplier tree in a multiplexed manner with relatively little increase in hardware. The generation of the logarithmic function is described in detail. Comparisons with other methods are made for the case of 11-bit accuracy of the logarithm. Using a figure of merit of latency times area (number of transistors), estimates show that the superposition scheme gives the best (smallest) figure of merit. For 11-bit accuracy, the superposition scheme requires only about 480 additional gates to be superimposed upon a 41-bit or larger multiplier, and the speed of operation is that of the multiplier.
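
The expansion step described in the abstract can be illustrated with a small software sketch. It is not the authors' procedure or hardware design: it only shows how a truncated power series in a binary fraction Y = v_1/2 + v_2/4 + ... + v_k/2^k expands into weighted products of the bits v_i. The one simplification applied here, v_i * v_i = v_i, follows directly from v_i being 0 or 1; the abstract does not say which Boolean reductions the paper actually uses, so treating this as representative is an assumption. The function names (`bit_poly`, `multiply`, `expand_series`, `evaluate`) and the example coefficients (the first terms of ln(1 + Y)) are chosen purely for illustration.

```python
from fractions import Fraction


def bit_poly(k):
    """Y = v_1/2 + v_2/4 + ... + v_k/2**k.

    A polynomial in the bits is stored as {frozenset of bit indices: weight},
    where each key stands for the product (logical AND) of those bits.
    """
    return {frozenset([i]): Fraction(1, 2**i) for i in range(1, k + 1)}


def multiply(p, q):
    """Multiply two bit polynomials.

    Because each v_i is 0 or 1, v_i * v_i = v_i, so the product of two
    monomials is the union of their index sets. Repeated bits collapse.
    """
    out = {}
    for mp, cp in p.items():
        for mq, cq in q.items():
            m = mp | mq
            out[m] = out.get(m, Fraction(0)) + cp * cq
    return out


def expand_series(coeffs, k):
    """Expand sum_n coeffs[n] * Y**n into weighted products of bits."""
    y = bit_poly(k)
    power = {frozenset(): Fraction(1)}  # Y**0 = 1 (empty product)
    total = {}
    for a in coeffs:
        for m, c in power.items():
            total[m] = total.get(m, Fraction(0)) + a * c
        power = multiply(power, y)
    # Drop monomials whose weights cancelled to zero.
    return {m: c for m, c in total.items() if c != 0}


def evaluate(poly, bits):
    """Evaluate a bit polynomial for bits = (v_1, ..., v_k)."""
    return sum(
        (c for m, c in poly.items() if all(bits[i - 1] for i in m)),
        Fraction(0),
    )


# Illustrative coefficients: ln(1 + Y) truncated to Y - Y**2/2 + Y**3/3.
LOG_COEFFS = [Fraction(0), Fraction(1), Fraction(-1, 2), Fraction(1, 3)]
```

Calling `expand_series(LOG_COEFFS, 3)` returns a dictionary with at most one entry per subset of the three bits, each entry being a rational weight attached to an AND of bits. In a hardware setting, weights of this kind would still have to be turned into binary-weighted terms before they could feed an adder tree; that conversion is not attempted in this sketch.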

Published in:

Parallel and Distributed Systems, IEEE Transactions on (Volume: 7, Issue: 1)