Fast K-Means with Accurate Bounds

We propose a novel accelerated exact $k$-means algorithm, which outperforms the current state-of-the-art low-dimensional algorithm in 18 of 22 experiments, running up to 3$\times$ faster. We also propose a general improvement of existing state-of-the-art accelerated exact $k$-means algorithms through better estimates of the distance bounds used to reduce the number of distance calculations, obtaining speedups in 36 of 44 experiments, of up to 1.8$\times$. We have conducted experiments with our own implementations of existing methods to ensure homogeneous evaluation of performance, and we show that our implementations perform as well or better than existing available implementations. Finally, we propose simplified variants of standard approaches and show that they are faster than their fully-fledged counterparts in 59 of 62 experiments.

Related material