In this article we present an efficient approach for the implementation of optimum maximum likelihood decoding of QPSK modulated multiple-input-multiple-output data streams. The proposed method does not compromise optimality of the detection algorithm. Instead it uses the special properties of QPSK modulation together with algebraic transformations and architectural optimizations to achieve very low hardware complexity and high speed. To our knowledge it is the fastest and most area efficient reported VLSI implementation of a hard decission Maximum Likelihood decoder for QPSK based MIMO systems. It can be applied to a variety of narrow band and wideband systems in many different configurations, including different degrees of spatial multiplexing and receive diversity.