Skip to Main Content
Fine-grained dataflow processing of sparse matrix-solve computation (AxÂ¿ = bÂ¿) in the SPICE circuit simulator can provide an order of magnitude performance improvement on modern FPGAs. Matrix solve is the dominant component of the simulator especially for large circuits and is invoked repeatedly during the simulation, once for every iteration. We process sparse-matrix computation generated from the SPICE-oriented KLU solver in dataflow fashion across multiple spatial floating-point operators coupled to high-bandwidth on-chip memories and interconnected by a low-latency network. Using this approach, we are able to show speedups of 1.2-64Ã (geometric mean of 8.8Ã) for a range of circuits and benchmark matrices when comparing double-precision implementations on a 250 MHz Xilinx Virtex-5 FPGA (65 nm) and an Intel Core i7 965 processor (45 nm).