Skip to Main Content
Detection of complex objects in streaming video poses two fundamental challenges: training from sparse data with proper generalization across variations in the object class and the environment; and the computational power required of the trained classifier running real-time. The Kerneltron supports the generalization performance of a support vector machine (SVM) and offers the bandwidth and efficiency of a massively parallel architecture. The mixed-signal very large-scale integration (VLSI) processor is dedicated to the most intensive of SVM operations: evaluating a kernel over large numbers of vectors in high dimensions. At the core of the Kerneltron is an internally analog, fine-grain computational array performing externally digital inner-products between an incoming vector and each of the stored support vectors. The three-transistor unit cell in the array combines single-bit dynamic storage, binary multiplication, and zero-latency analog accumulation. Precise digital outputs are obtained through oversampled quantization of the analog array outputs combined with bit-serial unary encoding of the digital inputs. The 256 input, 128 vector Kerneltron measures 3 mm×3mm in 0.5 μm CMOS, delivers 6.5 GMACS throughput at 5.9 mW power, and attains 8-bit output resolution.