This paper presents a novel algorithm and architecture for K-Best decoding that combine the radius shrinking commonly associated with sphere decoding with the architectural benefits of K-Best approaches. The proposed algorithm requires a much smaller K and gains the advantages of branch pruning and an adaptively updated pruning threshold, while still achieving near-optimum performance. It examines a much smaller subset of points than the conventional K-Best decoder. The VLSI architecture of the decoder is based on a pipelined, sorter-free scheme. The proposed K-Best decoder is designed to support a 4 × 4 64-QAM system and is synthesized in a 65-nm technology at a 158-MHz clock frequency and a 1-V supply. The synthesized decoder supports a throughput of 285.8 Mb/s at a 25-dB signal-to-noise ratio, with an area of 210 kGE and a power consumption of 12.8 mW.
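To make the idea of combining a K-Best search with a shrinking pruning threshold concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the proposed decoder. It assumes a real-valued system model that has already been QR-preprocessed, so the inputs are an upper-triangular matrix `R` and a rotated receive vector `y`. The function names, the greedy initial radius, and the rule that tightens the threshold by greedily completing the best survivor at each level are all hypothetical choices made for this example, since the abstract does not specify the update rule. The sketch also uses a plain software sort for brevity, whereas the architecture described above is sorter-free.

```python
def ped_increment(R, y, level, partial):
    """Squared residual of row `level`; `partial` holds symbols for columns level..n-1."""
    acc = y[level]
    for offset, sym in enumerate(partial):
        acc -= R[level][level + offset] * sym
    return acc * acc


def greedy_complete(R, y, alphabet, level, partial, ped):
    """Extend a partial path down to level 0, taking the locally best symbol each time."""
    for lvl in range(level, -1, -1):
        best = min(alphabet, key=lambda s: ped_increment(R, y, lvl, [s] + partial))
        ped += ped_increment(R, y, lvl, [best] + partial)
        partial = [best] + partial
    return partial, ped


def k_best_with_pruning(R, y, alphabet, K):
    """Breadth-first K-Best search with a threshold that shrinks as better leaves are found.

    Assumed rule (not taken from the paper): the threshold is always the distance of
    the best complete candidate seen so far, so pruning against it cannot discard
    the maximum-likelihood path; only the limit K can.
    """
    n = len(y)
    # Initial radius: distance of the greedy (successive-cancellation) solution.
    best_path, threshold = greedy_complete(R, y, alphabet, n - 1, [], 0.0)
    survivors = [([], 0.0)]
    visited = 0
    for level in range(n - 1, -1, -1):
        children = []
        for partial, ped in survivors:
            for s in alphabet:
                cand = [s] + partial
                new_ped = ped + ped_increment(R, y, level, cand)
                visited += 1
                if new_ped <= threshold:  # branch pruning against the current radius
                    children.append((cand, new_ped))
        children.sort(key=lambda c: c[1])  # software stand-in for K-Best selection
        survivors = children[:K]
        if not survivors:  # defensive guard; the best known candidate is still valid
            break
        # Adaptive update: complete the best survivor greedily and shrink the radius.
        path, dist = greedy_complete(R, y, alphabet, level - 1, *survivors[0])
        if dist < threshold:
            best_path, threshold = path, dist
    return best_path, threshold, visited


if __name__ == "__main__":
    # Hypothetical 3-dimensional example with a 4-PAM alphabet.
    R = [[1.0, 0.3, -0.2], [0.0, 0.9, 0.4], [0.0, 0.0, 1.1]]
    y = [0.7, -1.4, 3.2]
    print(k_best_with_pruning(R, y, alphabet=[-3, -1, 1, 3], K=2))
```

In this sketch, `visited` counts the child nodes whose partial Euclidean distance is evaluated, which gives a rough way to compare the search effort for different values of `K`. With `K` set to the full number of leaves, only the threshold prunes, and the result coincides with an exhaustive search; smaller `K` trades that guarantee for a smaller search.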