Tag-based Paper Retrieval: Minimizing User Effort with Diversity Awareness

With the number of published scientific papers likely to soar, most modern paper management systems (e.g., ScienceWise, Mendeley, CiteULike) support tag-based retrieval. In this setting, each paper is associated with a set of \emph{tags}, allowing users to search for relevant papers by formulating tag-based queries against the system. One of the most critical issues in tag-based retrieval is that users often have difficulty formulating their information need precisely. To address this issue, our paper tackles the problem of automatically suggesting new tags to the user while a query is being formulated. The set of tags is selected so as to resolve query ambiguity in two respects: \emph{informativeness} and \emph{diversity}. While the former reduces the user effort needed to find the desired papers, the latter enhances the variety of information shown to the user. By studying the theoretical properties of this problem, we propose a heuristic-based algorithm with several salient performance guarantees. We also demonstrate the efficiency of our approach through extensive experiments on real-world datasets.
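
To make the setting concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes a toy corpus, a conjunctive query model (a paper matches if it carries every query tag), and a hypothetical greedy scoring rule: a tag counts as informative when it splits the current result set roughly in half, and as diverse when the papers it covers overlap little with those covered by tags already suggested. The names `retrieve`, `suggest_tags`, `jaccard`, and `diversity_weight` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def retrieve(papers: Dict[str, Set[str]], query: Set[str]) -> Set[str]:
    """Return the ids of papers whose tag set contains every query tag."""
    return {pid for pid, tags in papers.items() if query <= tags}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_tags(
    papers: Dict[str, Set[str]],
    query: Set[str],
    k: int = 2,
    diversity_weight: float = 0.5,  # assumed trade-off parameter
) -> List[str]:
    """Greedily pick up to k tags that refine the query (illustrative heuristic)."""
    matches = retrieve(papers, query)
    if not matches:
        return []
    # Candidate tags: those on matching papers that are not already in the query.
    candidates = set().union(*(papers[p] for p in matches)) - query
    # For each candidate, the subset of current matches that carries it.
    covers = {t: {p for p in matches if t in papers[p]} for t in candidates}

    chosen: List[str] = []
    while candidates and len(chosen) < k:

        def score(tag: str) -> float:
            frac = len(covers[tag]) / len(matches)
            # Highest (1.0) when the tag splits the result set in half.
            informativeness = 1.0 - abs(2.0 * frac - 1.0)
            # Penalize tags covering the same papers as earlier suggestions.
            overlap = max((jaccard(covers[tag], covers[c]) for c in chosen), default=0.0)
            return informativeness - diversity_weight * overlap

        # Sorting makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Toy corpus (invented for this sketch).
papers = {
    "p1": {"physics", "quantum", "optics"},
    "p2": {"physics", "quantum", "computing"},
    "p3": {"physics", "astro", "cosmology"},
    "p4": {"physics", "astro", "optics"},
    "p5": {"biology", "genetics"},
}
suggestions = suggest_tags(papers, {"physics"}, k=2)
```

In this toy run the query `{"physics"}` matches four papers, and the second suggestion is chosen to cover papers that the first suggestion does not, which is the intuition behind combining informativeness with diversity.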

Presented at:
20th International Conference on Database Systems for Advanced Applications, Hanoi, Vietnam, April 20-23, 2015

Record created 2015-02-07, last modified 2019-03-17
