In this paper, we present a novel approach to joint detection and decoding for spatial-multiplexing multiple-input multiple-output (MIMO) systems that use convolutional codes. The bit error rate (BER) performance of the proposed approach is significantly better than that of systems that use separate detection and decoding blocks. Formal algorithms for two possible system setups are presented and their performance is documented. In particular, for a reference 4 × 4, 16-QAM system using a rate-1/2 convolutional code with generator polynomials [247, 371] and a constraint length of 8, signal-to-noise ratio (SNR) improvements of 2.5 dB and 3 dB are achieved over conventional soft decoding at a BER of 10⁻⁵. A proof-of-concept VLSI architecture for one of the algorithms is provided, and a novel way to reduce memory usage is demonstrated. Results indicate that better performance than conventional systems is achievable with comparable hardware complexity. The proposed design was synthesized and laid out in 65-nm CMOS technology at a clock frequency of 181 MHz. An average throughput of 216.9 Mbps at an SNR of 13 dB was achieved with an area equivalent to 553 Kgates.
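To make the reference code concrete, the following is a minimal sketch of a rate-1/2 feedforward convolutional encoder with constraint length 8. It assumes that the generators [247, 371] are given in octal (both then fit in 8 bits) and that the newest input bit occupies the most significant bit of the shift register; neither convention is stated above, and the function name `conv_encode` is hypothetical. It illustrates the code only, not the proposed joint detection and decoding algorithms.

```python
K = 8                         # constraint length, as stated in the abstract
GENERATORS = (0o247, 0o371)   # assumption: generators are written in octal


def conv_encode(bits, generators=GENERATORS, k=K):
    """Encode a bit sequence with a feedforward convolutional code.

    Produces len(generators) output bits per input bit (rate 1/2 here).
    Assumed convention: the newest input bit enters at the most
    significant bit of the k-bit register; the encoder starts in the
    all-zero state and no tail bits are appended.
    """
    reg = 0
    out = []
    for b in bits:
        # Shift the register right and insert the new bit at the top.
        reg = ((b & 1) << (k - 1)) | (reg >> 1)
        for g in generators:
            # Each output bit is the parity of the tapped register bits.
            out.append(bin(reg & g).count("1") & 1)
    return out


if __name__ == "__main__":
    # Impulse response: a single 1 followed by K - 1 zeros.
    print(conv_encode([1] + [0] * (K - 1)))
```

Under these assumptions, the impulse response reads out the two generators bit by bit, interleaved, which is a quick way to check the tap convention. With four transmit streams of 16-QAM, each channel use would carry 16 coded bits, that is, 8 information bits at rate 1/2.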