Although there are many neural network FPGA architectures, there is no framework for designing large, high-performance neural networks suitable for the real world. In this paper, we present two concepts to support a multi-FPGA architecture for stochastic restricted Boltzmann machines (RBMs), a popular type of neural network. First, a hardware core, called the kth-stage piecewise linear interpolator, is used to implement a high-precision, pipelined function generator. The interpolator increases the resolution of a look-up table implementation, guaranteeing an additional bit of precision for every pipeline stage. This function generator is used to implement the sigmoid function required in stochastic node selection. Next, a partitioning algorithm is used to efficiently divide an RBM amongst multiple FPGAs. The partitioning algorithm optimizes performance by minimizing inter-FPGA communication. The architecture is tested on the Berkeley Emulation Engine 2 running at 100 MHz. One board supports an RBM of 256 × 256 nodes and achieves a computational speed of 1.85 billion connection-updates-per-second, a speed-up of 85-fold over an optimized C program running on a 2.8 GHz Intel processor.
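The core idea behind the interpolator can be illustrated in software: a coarse look-up table stores sigmoid samples, and linear interpolation between adjacent entries recovers precision beyond the table's resolution. This is only a minimal floating-point sketch, not the paper's pipelined fixed-point hardware; the table size and input range are illustrative assumptions.

```python
import math

def make_sigmoid_lut(addr_bits, x_min=-8.0, x_max=8.0):
    """Build a coarse sigmoid look-up table with 2**addr_bits intervals.

    addr_bits, x_min, and x_max are illustrative choices, not values
    from the paper.
    """
    n = 2 ** addr_bits
    step = (x_max - x_min) / n
    lut = [1.0 / (1.0 + math.exp(-(x_min + i * step))) for i in range(n + 1)]
    return lut, x_min, step

def sigmoid_interp(x, lut, x_min, step):
    """Approximate sigmoid(x) by linear interpolation between LUT entries."""
    t = (x - x_min) / step
    i = max(0, min(int(t), len(lut) - 2))  # clamp to the table range
    frac = t - i                           # fractional position in interval
    return lut[i] + frac * (lut[i + 1] - lut[i])

lut, x_min, step = make_sigmoid_lut(addr_bits=6)
y = sigmoid_interp(0.0, lut, x_min, step)
```

In hardware, the fractional part `frac` would be consumed one bit at a time across pipeline stages, each stage halving the interval and contributing one more bit of output precision.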