Estimating doubly-selective channels is challenging because a long channel impulse response must be estimated while tracking rapid time variation. When the channel response is sparse, i.e., only a few of the channel gains are nonzero, the tracking performance of the channel estimator can be improved significantly by avoiding estimation of the zero taps. In this paper, we study the estimation of fast time-varying, long reverberant channels that have a sparse structure in multiple-input multiple-output (MIMO) systems. To exploit the sparse structure, we parameterize the locations of the nonzero taps using a binary vector and incorporate it into a state-space system built upon auto-regressive (AR) modeling of the time-varying channel gains. We then derive a joint estimate of the binary vector and the channel gains based on the maximum likelihood (ML) criterion. An expectation-maximization (EM) algorithm is derived to find the sparse structure and the channel gains iteratively. In a simulation study performed over MIMO Rician fading channels, the proposed sparse channel estimation technique outperforms the previous estimation schemes, especially when the Doppler rate is high.
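The benefit of skipping zero taps can be illustrated with a small numerical sketch. It is not the proposed EM/ML estimator: it assumes a single-antenna link, a first-order AR model with a hypothetical coefficient `ar_coeff`, random Gaussian pilots, and a tap support that is already known, and it compares plain least squares over all taps with least squares restricted to the taps flagged by the binary vector. All function names and parameter values below are illustrative assumptions.

```python
import numpy as np


def complex_normal(rng, shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def simulate_sparse_ar1_channel(num_taps, support, num_steps, ar_coeff, rng):
    """AR(1) tap gains masked by a binary vector b (1 = nonzero tap).

    Assumed model (illustrative only): g_t = a * g_{t-1} + sqrt(1 - a^2) * w_t,
    and the channel at time t is h_t = b * g_t, so inactive taps stay zero.
    """
    b = np.zeros(num_taps)
    b[list(support)] = 1.0
    gains = complex_normal(rng, num_taps)
    drive_std = np.sqrt(1.0 - ar_coeff ** 2)
    h = np.zeros((num_steps, num_taps), dtype=complex)
    for t in range(num_steps):
        gains = ar_coeff * gains + drive_std * complex_normal(rng, num_taps)
        h[t] = b * gains
    return b, h


def ls_estimate(pilots, y, mask=None):
    """Least-squares tap estimate; if mask is given, only flagged taps are fitted."""
    num_taps = pilots.shape[1]
    cols = np.arange(num_taps) if mask is None else np.flatnonzero(mask)
    est = np.zeros(num_taps, dtype=complex)
    est[cols] = np.linalg.lstsq(pilots[:, cols], y, rcond=None)[0]
    return est


rng = np.random.default_rng(0)
num_taps, num_obs, noise_std = 32, 40, 0.1  # hypothetical sizes and noise level
b, h = simulate_sparse_ar1_channel(num_taps, [2, 9, 17, 30], 50, 0.99, rng)

err_full = 0.0
err_sparse = 0.0
for h_t in h:
    pilots = complex_normal(rng, (num_obs, num_taps))
    y = pilots @ h_t + noise_std * complex_normal(rng, num_obs)
    err_full += np.sum(np.abs(ls_estimate(pilots, y) - h_t) ** 2)
    err_sparse += np.sum(np.abs(ls_estimate(pilots, y, b) - h_t) ** 2)

print("squared error, all taps fitted:    ", err_full)
print("squared error, support taps fitted:", err_sparse)
```

With only 4 of 32 taps active, the restricted fit estimates far fewer unknowns from the same 40 observations, which is the intuition behind the sparse parameterization. In the setting described above the support is unknown and must be estimated jointly with the gains.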