Author: Nguyen, T Q
Date: 2007-07-11
URL: https://infoscience.epfl.ch/handle/20.500.14299/9463

Abstract: Digital libraries are libraries whose collections are stored in a digital format (at least the metadata). Digital libraries are now being made publicly available. However, building good user interfaces for querying heterogeneous libraries requires a good knowledge of the type of information available (e.g. which attributes are useful for filtering). In this project, we harvest (using the Z39.50 and OAI-PMH protocols) and analyze (in terms of useful attributes for querying) four important digital libraries: Nebis (five million items), Infoscience (sixty thousand items), CiteSeer (seven hundred thousand items) and The European Library (one and a half million items).

Keywords: Digital libraries; databases; Z39.50; OAI-PMH; user interface; query; attribute; entropy
Subject: Digital Libraries
Type: text::report
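
The keywords mention entropy, but the record does not say how the usefulness of an attribute is measured. As one possible reading, the sketch below assumes Shannon entropy over the values an attribute takes across harvested records: an attribute whose values are spread evenly scores high, while one that is nearly constant scores close to zero. The function name, field names and sample records are hypothetical and are not taken from the report.

```python
import math
from collections import Counter


def attribute_entropy(records, attribute):
    """Shannon entropy, in bits, of the values of one attribute.

    Records that lack the attribute are counted under the value None.
    """
    values = [record.get(attribute) for record in records]
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    # Sum p * log2(1/p) over the observed values, with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical metadata records, standing in for harvested items.
sample_records = [
    {"language": "en", "type": "report"},
    {"language": "en", "type": "article"},
    {"language": "fr", "type": "report"},
    {"language": "de", "type": "report"},
]

for name in ("language", "type"):
    print(name, round(attribute_entropy(sample_records, name), 3))
```

In this toy sample, "language" scores higher than "type" because its values are more evenly distributed. Whether a high score makes an attribute a good filter is an assumption of this sketch, not a claim from the record.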