Development of a flexible tool for the automatic comparison of bibliographic records. Application to sample collections - Développement d'un logiciel flexible pour la comparaison de notices bibliographiques et application à différentes collections

Borel, Alain; Krause, Jan

Master's thesis

2009

The proliferation of digital bibliographic catalogues (open repositories, library and bookseller catalogues) confronts information specialists with the challenge of mass-processing large amounts of metadata for various purposes. Among the many possible applications, determining the similarity between records is an important issue. Such similarity can be of interest from a bibliographic point of view (do the records describe the same document? The answer is useful for deduplication and for collection overlap studies) as well as from a thematic point of view (suggesting documents to users, managing content within the framework of a library policy, automatically classifying documents, and so on). To meet these varied needs, we propose a flexible, open-source, multiplatform software tool that supports the implementation of multiple record comparison strategies. In a second step, we study the relevance and performance of several algorithms applied to a selection of collections differing in size, origin, and document types.
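
The abstract does not say which comparison algorithms the thesis implements. As a rough illustration only, one possible field-wise strategy could look like the Python sketch below. The field names, the weights, and the functions `normalize`, `field_similarity`, and `record_similarity` are all hypothetical and are not taken from the thesis or its software; the string measure is the standard library's `difflib.SequenceMatcher`, chosen for brevity.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace (an assumed, minimal normalization)."""
    return " ".join(value.lower().split())


def field_similarity(a, b):
    """Similarity in [0, 1] between two field values, via difflib."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_similarity(rec_a, rec_b, weights):
    """Weighted mean of field similarities over fields present in both records.

    `weights` maps hypothetical field names to relative importance.
    Fields missing or empty in either record are skipped.
    """
    score = 0.0
    total = 0.0
    for field, weight in weights.items():
        a = rec_a.get(field)
        b = rec_b.get(field)
        if not a or not b:
            continue  # not comparable: ignore this field
        score += weight * field_similarity(a, b)
        total += weight
    return score / total if total else 0.0


# Illustrative weights only; a real study would tune them per collection.
WEIGHTS = {"title": 3, "author": 2, "year": 1}

rec_a = {"title": "Flexible Record  Comparison", "author": "Borel, Alain", "year": "2009"}
rec_b = {"title": "flexible record comparison", "author": "Borel, Alain", "year": "2009"}
rec_c = {"title": "Quantum Chemistry of Lanthanides", "author": "Smith, J.", "year": "1998"}

print(record_similarity(rec_a, rec_b, WEIGHTS))
print(record_similarity(rec_a, rec_c, WEIGHTS))
```

In a deduplication setting, a score above some chosen threshold would flag two records as likely describing the same document. A thematic comparison would more plausibly weight subject or keyword fields. Swapping the weights or the per-field measure is one simple way to express different strategies.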

Name

CESID-memoire_ABJK.pdf

Access type

Open access

Size

1.12 MB

Format

Adobe PDF

Checksum (MD5)

b83d71d12b1380b8d832f3bb22adde99