Makarychev, KonstantinMakarychev, YurySviridenko, MaximWard, Justin2017-08-032017-08-032017-08-03201610.4230/LIPIcs.APPROX-RANDOM.2016.14https://infoscience.epfl.ch/handle/20.500.14299/139537We consider the classical k-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output beta*k > k clusters, and must produce a clustering with cost at most alpha times the to the cost of the optimal set of k clusters. We argue that this approach is natural in many settings, for which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee alpha(beta) depending on the number beta*k of clusters that may be opened. Our guarantee alpha(beta) is always at most 9 + epsilon and improves rapidly with beta (for example: alpha(2) < 2.59, and alpha(3) < 1.4). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.k-means clusteringbicriteria approximation algorithmslinear programminglocal searchA Bi-Criteria Approximation Algorithm for k-Meanstext::conference output::conference proceedings::conference paper