Development of a flexible tool for the automatic comparison of bibliographic records. Application to sample collections - Développement d'un logiciel flexible pour la comparaison de notices bibliographiques et application à différentes collections

With the proliferation of digital bibliographic catalogues (open repositories, library and bookseller catalogues), information specialists face the challenge of mass-processing large amounts of metadata for various purposes. Among the many possible applications, determining the similarity between records is an important one. Such similarity can be of interest from a bibliographic point of view (do the records describe the same document? The answer is useful for deduplication and for collection overlap studies) as well as from a thematic point of view (suggesting documents to users, managing content within the framework of a library policy, classifying documents automatically, and so on). To meet these varied needs, we propose a flexible, open-source, multiplatform software tool that supports the implementation of multiple record comparison strategies. In a second step, we study the relevance and performance of several algorithms applied to a selection of collections that differ in size, origin, and document types, among other characteristics.


Advisor(s):
Savoy, Jacques
Year:
2009
Note:
Diplôme universitaire de formation continue en information documentaire (CESID), Université de Genève
 Record created 2009-10-13, last modified 2018-03-17
