The QR decomposition is an important but often underestimated prerequisite for pseudo-linear and non-linear detection methods, such as successive interference cancellation and sphere decoding, in multiple-input multiple-output (MIMO) systems. Performing iterative sorting concurrently with the QR decomposition adds moderate overall latency but provides the basis for improved layered stream decoding. This paper describes the architecture and results of the first VLSI implementation of an iterative sorted QR decomposition preprocessor for MIMO receivers. The presented architecture preprocesses the MIMO channel using Givens rotations to compute the minimum mean squared error (MMSE) QR decomposition.
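
As a purely illustrative software reference, and not the VLSI architecture presented in this paper, the following minimal sketch shows one common way to compute a sorted MMSE QR decomposition with Givens rotations. It assumes the MMSE formulation in which the channel matrix H is extended to A = [H; sigma*I], with sigma the noise standard deviation, and it assumes a smallest-residual-norm-first column ordering. The function name `mmse_sorted_qr` and all variable names are hypothetical.

```python
import math


def mmse_sorted_qr(H, sigma):
    """Sorted QR decomposition of the augmented matrix A = [H; sigma*I].

    Assumed formulation (not taken from the paper): returns (Qh, R, perm)
    such that Qh * A[:, perm] = R, with R upper triangular and Qh unitary.
    """
    n_r, n_t = len(H), len(H[0])
    m = n_r + n_t

    # Build the MMSE-augmented channel matrix A = [H; sigma * I].
    A = [[complex(x) for x in row] for row in H]
    A += [[complex(sigma if i == j else 0.0) for j in range(n_t)]
          for i in range(n_t)]

    # Qh accumulates the product of all rotations (it starts as identity).
    Qh = [[complex(i == j) for j in range(m)] for i in range(m)]
    perm = list(range(n_t))

    for k in range(n_t):
        # Sorting step: among the remaining columns, pick the one whose
        # not-yet-triangularized part (rows k..m-1) has the smallest energy.
        norms = [sum(abs(A[i][j]) ** 2 for i in range(k, m))
                 for j in range(k, n_t)]
        p = k + norms.index(min(norms))
        if p != k:
            for row in A:
                row[k], row[p] = row[p], row[k]
            perm[k], perm[p] = perm[p], perm[k]

        # Givens step: zero every entry below the diagonal in column k.
        for i in range(k + 1, m):
            a, b = A[k][k], A[i][k]
            r = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
            if b == 0 or r == 0.0:
                continue  # nothing to eliminate
            c, s = a / r, b / r
            # Apply the unitary rotation [[c*, s*], [-s, c]] to rows k and i
            # of both the working matrix and the rotation accumulator.
            for M in (A, Qh):
                top, bot = M[k], M[i]
                M[k] = [c.conjugate() * t + s.conjugate() * u
                        for t, u in zip(top, bot)]
                M[i] = [-s * t + c * u for t, u in zip(top, bot)]
            A[i][k] = 0j  # remove rounding residue of the eliminated entry

    return Qh, A, perm
```

Here `perm` records the column order selected by the sorting step, which a layered detector would need in order to map detected streams back to the original transmit antennas. A hardware implementation would typically replace the floating-point rotations with fixed-point arithmetic; that choice is outside the scope of this sketch.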