Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. EPFL thesis
  4. Approximation algorithms for geometric dispersion
 
doctoral thesis

Approximation algorithms for geometric dispersion

Cevallos Manzano, Alfonso Bolívar  
2016

The most basic form of the max-sum dispersion problem (MSD) is as follows: given n points in R^q and an integer k, select a set of k points such that the sum of the pairwise distances within the set is maximal. This is a prominent diversity problem, with wide applications in web search and information retrieval, where one needs to find a small and diverse representative subset of a large dataset. The problem has recently received a great deal of attention in the computational geometry and operations research communities; and since it is NP-hard, research has focused on efficient heuristics and approximation algorithms. Several classes of distance functions have been considered in the literature. Many of the most common distances used in applications are induced by a norm in a real vector space. The focus of this thesis is on MSD over these geometric instances. We provide for it simple and fast polynomial-time approximation schemes (PTASs), as well as improved constant-factor approximation algorithms. We pay special attention to the class of negative-type distances, a class that includes Euclidean and Manhattan distances, among many others. In order to exploit the properties of this class, we apply several techniques and results from the theory of isometric embeddings. We explore the following variations of the MSD problem: matroid and matroid-intersection constraints, knapsack constraints, and the mixed-objective problem that maximizes a combination of the sum of pairwise distances with a submodular monotone function. In addition to approximation algorithms, we present a core-set for geometric instances of low dimension, and we discuss the efficient implementation of some of our algorithms for massive datasets, using the streaming and distributed models of computation.

  • Files
  • Details
  • Metrics
Type
doctoral thesis
DOI
10.5075/epfl-thesis-7291
Author(s)
Cevallos Manzano, Alfonso Bolívar  
Advisors
Eisenbrand, Friedrich  
Jury

Prof. János Pach (président) ; Prof. Friedrich Eisenbrand (directeur de thèse) ; Dr Ola Svensson, Dr Rico Zenklusen, Prof. Nabil Mustafa (rapporteurs)

Date Issued

2016

Publisher

EPFL

Publisher place

Lausanne

Public defense year

2016-11-18

Thesis number

7291

Total of pages

96

Subjects

Combinatorial optimization

•

computational geometry

•

approximation algorithms

•

max-sumdispersion

•

remote clique

•

distances of negative type

•

theory of embeddings

•

convex programming

•

local search

•

core-sets.

EPFL units
DISOPT  
Faculty
SB  
School
MATHAA  
Doctoral School
EDMA  
Available on Infoscience
November 9, 2016
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/130975
Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés