Skip to Main Content
This paper addresses the potential speedup achieved by using decimal floating-point hardware, instead of software routines, on a high-performance superscalar architecture. Software routines were written to perform decimal addition, subtraction, multiplication, and division. Cycle counts were then measured for each instruction using the Simplescalar simulator. After this, new hardware algorithms were developed, existing hardware algorithms were analyzed, and cycle counts were estimated for the same set of instructions using specialized decimal floating-point hardware. This data was then used to show the potential speedup obtained for programs with different instruction mixes and a previously developed benchmark.