A scalable pipelined VLSI architecture for K-best lattice decoders that operates efficiently over infinite lattices is proposed. The proposed architecture operates at a significantly lower complexity than currently reported schemes. The key contribution is a means of expanding (visiting) the intermediate nodes of the search tree on demand rather than exhaustively, combined with three types of distributed sorters operating in a pipelined structure. Together, the expansion and sorting cores find the K best candidates in just K clock cycles. Support for unbounded (infinite) lattice decoding distinguishes our work from previous K-best strategies. Since the expansion and sorting cores cooperate on a data-driven basis, the architecture is well suited to a pipelined parallel VLSI implementation. The proposed architecture offers the lowest latency reported to date, a fixed critical path independent of the constellation order, an on-demand expansion scheme, efficient distributed sorters, and a pipelined high-throughput implementation, and it scales to a higher number of antennas and higher constellation orders.
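The on-demand expansion idea can be illustrated in software with a minimal sketch for a single tree level. It is not the proposed hardware architecture: it assumes a real-valued integer lattice, a zig-zag (Schnorr-Euchner style) child ordering, and a priority queue standing in for the distributed sorters. The names `zigzag`, `k_best_children`, `parents`, and `r` are hypothetical and chosen only for illustration.

```python
import heapq


def zigzag(center):
    """Yield integers in order of non-decreasing distance from `center`.

    The enumeration is unbounded, so no constellation boundary is assumed.
    """
    x0 = round(center)
    step = 1 if center >= x0 else -1  # side of x0 on which the center lies
    yield x0
    k = 1
    while True:
        yield x0 + step * k
        yield x0 - step * k
        k += 1


def k_best_children(parents, k, r=1.0):
    """Return the k lowest-cost children of one tree level.

    parents: list of (partial_cost, center) pairs (illustrative layout).
    r:       scale of the distance increment (assumed diagonal entry of R).
    Each result is (child_cost, parent_index, symbol).
    """
    enumerators = [zigzag(center) for _, center in parents]
    heap = []

    def push_next(idx):
        # Expand only the next-best child of parent `idx`, on demand.
        cost, center = parents[idx]
        x = next(enumerators[idx])
        heapq.heappush(heap, (cost + (r * (x - center)) ** 2, idx, x))

    for idx in range(len(parents)):
        push_next(idx)  # each parent contributes its best child first

    best = []
    while heap and len(best) < k:
        child = heapq.heappop(heap)
        best.append(child)
        push_next(child[1])  # replace the winner with its next sibling
    return best


if __name__ == "__main__":
    survivors = k_best_children([(0.0, 0.3), (0.5, 1.9)], k=3)
    for cost, parent, symbol in survivors:
        print(f"parent={parent} symbol={symbol} cost={cost:.2f}")
```

In this sketch each selection step pops one winner and expands exactly one new child, so the number of evaluated nodes grows with K and the number of parents rather than with the constellation size. That mirrors, in a loose software sense, the one-candidate-per-cycle behavior described above.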