Sphere decoding (SD) is widely considered one of the most promising detection schemes for multiple-input multiple-output (MIMO) communication systems. The recently proposed list sphere-decoding (LSD) algorithm extends the original SD algorithm and considerably improves the error-rate performance of wireless communication systems by providing soft outputs instead of binary decisions. This paper addresses the VLSI implementation of the LSD algorithm. To this end, algorithm optimizations suitable for efficient hardware implementation are developed. The implemented circuits achieve an SNR gain of up to 3 dB over hard-output SDs and a throughput of up to 272 Mbps at 20 dB SNR in a 0.25 µm technology for 4×4 MIMO systems with 16-QAM modulation.