Infoscience

conference paper

Indexing and selection of data items in huge data sets by constructing and accessing tag collections

Ponce, Sebastien; Vila, P.M.; Hersch, R.D.
2002
19th IEEE Symposium on Mass Storage Systems & Tenth Goddard Conf. on Mass Storage Systems and Technologies

We present a new way of indexing and retrieving data in huge, high-dimensional datasets. The proposed method speeds up selection by replacing scans of the whole dataset with scans of only the matching data. It uses two levels of catalogs that allow efficient preselection of data. First-level catalogs contain only a small subset of the data items, selected according to given criteria; queries can be run on them to preselect items. A refined query can then be carried out on the preselected data items within the full dataset. A second-level catalog maintains the list of existing first-level catalogs and the type and kind of data items they store. We establish a mathematical model of our indexing technique and show that it considerably speeds up access to LHCb experiment event data at CERN (European Laboratory for Particle Physics).
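The two-level scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (`TagCollection`, `MetaCatalog`), the `select` helper, and the attribute names (`energy`, `n_tracks`, `mass`) are all hypothetical. Storing only a few attributes per item in a first-level catalog is an assumption suggested by the title's "tag collections"; the abstract itself only says that first-level catalogs hold a small subset of the data items.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Item = Dict[str, float]  # a full data item: attribute name -> value


@dataclass
class TagCollection:
    """Hypothetical first-level catalog: a few attributes for a subset of items."""
    name: str
    attributes: List[str]
    tags: Dict[int, Item] = field(default_factory=dict)  # item id -> stored attributes

    def preselect(self, predicate: Callable[[Item], bool]) -> List[int]:
        # Scan only the small catalog, never the full dataset.
        return [item_id for item_id, tag in self.tags.items() if predicate(tag)]


@dataclass
class MetaCatalog:
    """Hypothetical second-level catalog: lists first-level catalogs and what they store."""
    collections: Dict[str, TagCollection] = field(default_factory=dict)

    def build(self, name: str, dataset: Dict[int, Item],
              criterion: Callable[[Item], bool],
              attributes: List[str]) -> TagCollection:
        # One full scan at construction time; later queries reuse the result.
        coll = TagCollection(name, list(attributes))
        for item_id, item in dataset.items():
            if criterion(item):
                coll.tags[item_id] = {a: item[a] for a in attributes}
        self.collections[name] = coll
        return coll

    def find(self, needed: List[str]) -> List[TagCollection]:
        # Catalogs that store every attribute a query needs.
        return [c for c in self.collections.values()
                if set(needed) <= set(c.attributes)]


def select(dataset: Dict[int, Item], collection: TagCollection,
           tag_predicate: Callable[[Item], bool],
           full_predicate: Callable[[Item], bool]) -> List[int]:
    # Step 1: preselect on the first-level catalog.
    candidates = collection.preselect(tag_predicate)
    # Step 2: refine on the full data, touching only preselected items.
    return [i for i in candidates if full_predicate(dataset[i])]


# Toy data with made-up attributes and values.
dataset = {
    0: {"energy": 5.0, "n_tracks": 2, "mass": 1.1},
    1: {"energy": 50.0, "n_tracks": 8, "mass": 5.3},
    2: {"energy": 70.0, "n_tracks": 3, "mass": 5.2},
    3: {"energy": 90.0, "n_tracks": 9, "mass": 0.5},
}
catalog = MetaCatalog()
high_e = catalog.build("high_energy", dataset,
                       lambda it: it["energy"] > 20.0,
                       ["energy", "n_tracks"])
result = select(dataset, high_e,
                lambda tag: tag["n_tracks"] >= 5,
                lambda it: it["mass"] > 5.0)
```

In this toy run, the refined query reads only the two preselected items from the full dataset rather than all four, which is the kind of saving the abstract attributes to scanning matching data instead of the whole dataset.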

Name: iasodiihdsbcaatc.pdf
Access type: openaccess
Size: 235.72 KB
Format: Adobe PDF
Checksum (MD5): 892eab60c37f30ea8b909573c15a202f
