Low latency, high throughput, and small area are three major design considerations for FPGA (field-programmable gate array) designs. In this paper, we present a high-radix SRT division algorithm and a binary restoring square root algorithm. We describe three implementations of floating-point division with variable width and precision, targeting Virtex-2 FPGAs: a low-cost iterative implementation, a low-latency array implementation, and a high-throughput pipelined implementation. Implementations of floating-point square root are presented as well. In addition to the module designs, we analyze the tradeoffs among cost, latency, and throughput, and give strategies for reducing cost or improving performance.
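
For readers unfamiliar with the term, a binary restoring square root can be sketched in software as below. This is only an illustrative integer model of the textbook digit-by-digit method, not the hardware design described in this paper; the function name `restoring_sqrt` and its parameters are assumptions introduced for the example. Each iteration brings down two radicand bits, tries a subtraction, and either keeps the result (root bit 1) or restores the previous partial remainder (root bit 0).

```python
def restoring_sqrt(radicand: int, bits: int) -> tuple[int, int]:
    """Digit-by-digit binary restoring square root (illustrative sketch).

    radicand: non-negative integer that fits in 2 * bits bits.
    bits:     number of root bits to produce (one per iteration).
    Returns (root, remainder) with radicand == root * root + remainder.
    """
    if radicand < 0 or radicand >= 1 << (2 * bits):
        raise ValueError("radicand must fit in 2 * bits bits")

    root = 0  # root bits produced so far
    rem = 0   # partial remainder

    for i in range(bits - 1, -1, -1):
        # Bring down the next pair of radicand bits into the remainder.
        rem = (rem << 2) | ((radicand >> (2 * i)) & 0b11)

        # Trial subtrahend: the current root with "01" appended.
        trial = (root << 2) | 1
        diff = rem - trial

        root <<= 1
        if diff >= 0:
            # Subtraction succeeded: keep it and set the new root bit.
            rem = diff
            root |= 1
        # Otherwise the old remainder is kept unchanged, which is the
        # "restore" step; the new root bit stays 0.

    return root, rem


if __name__ == "__main__":
    print(restoring_sqrt(25, 3))
    print(restoring_sqrt(200, 4))
```

In hardware, each loop iteration corresponds to one subtract-and-select stage. Under that reading, the stage could be reused over several cycles (iterative), replicated combinationally (array), or replicated with registers between stages (pipelined), which mirrors the three implementation styles named above.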