Infoscience

doctoral thesis

Modeling and Enhancing Human Knowledge Navigation

Arora, Akhil  
2024

Homo sapiens (Latin: "wise human") is a born knowledge seeker. While modern AI-powered search engines have revolutionized knowledge-seeking by responding to our queries in a manner that resembles natural conversation, systems that understand our knowledge-seeking needs and "take us by the hand" in navigating online knowledge are yet to be realized.

The work presented in this thesis takes a first step toward realizing the next generation of information systems. Specifically, we devise a framework for modeling and enhancing human knowledge navigation in online platforms and make two major contributions. First, we develop methods for understanding and modeling human navigation on Wikipedia, the largest platform for open knowledge. Second, we devise methods and tools for mitigating content and structural knowledge gaps, thereby facilitating improvements in human knowledge navigation behavior. The methodological contributions of this thesis are organized into three parts: Parts II, III, and IV.

In Part II, we describe an information-theoretic measure for understanding the underlying dynamics of human knowledge navigation on Wikipedia. Surprisingly, we find that the majority of human navigation on Wikipedia is Markovian and, leveraging this insight, devise the first large-scale privacy-preserving model for synthesizing human-like navigation traces from aggregate data alone.
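
As a rough illustration of what synthesizing traces from aggregate data can look like when navigation is Markovian, the sketch below samples walks from a first-order Markov chain built only from pairwise click counts. It is not the model developed in the thesis: the article names, the counts, and the helpers `build_transition_table` and `sample_trace` are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical aggregate data: (source article, target article) -> click count.
# No individual user session is needed, only these pairwise totals.
PAIR_COUNTS = {
    ("Physics", "Energy"): 70,
    ("Physics", "Isaac Newton"): 30,
    ("Energy", "Physics"): 20,
    ("Energy", "Entropy"): 80,
    ("Isaac Newton", "Physics"): 100,
}


def build_transition_table(pair_counts):
    """Group the aggregate counts by source article.

    Returns a dict mapping each source to (targets, weights), two parallel lists.
    """
    table = defaultdict(lambda: ([], []))
    for (source, target), count in pair_counts.items():
        targets, weights = table[source]
        targets.append(target)
        weights.append(count)
    return dict(table)


def sample_trace(table, start, max_len, rng):
    """Sample one synthetic trace under a first-order Markov assumption.

    The next article depends only on the current one. The walk stops at
    max_len articles or when the current article has no recorded outgoing clicks.
    """
    trace = [start]
    while len(trace) < max_len and trace[-1] in table:
        targets, weights = table[trace[-1]]
        trace.append(rng.choices(targets, weights=weights, k=1)[0])
    return trace


table = build_transition_table(PAIR_COUNTS)
rng = random.Random(0)  # fixed seed so repeated runs produce the same traces
traces = [sample_trace(table, "Physics", max_len=5, rng=rng) for _ in range(3)]
for trace in traces:
    print(" -> ".join(trace))
```

Because the sampler sees only per-pair totals, every synthetic trace is composed of transitions that appear in the aggregate data, and no real user's path is ever stored or replayed.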

In Part III, we present methods for mitigating content gaps and gaps in knowledge stores, which improve the knowledge organization of Web corpora, thereby facilitating improvements in knowledge navigation as a by-product. Focusing on enriching the textual content by grounding concepts to knowledge bases, we devise EIGENTHEMES, the first truly unsupervised entity linker that relies solely on the availability of entity names and a referent knowledge base. To improve knowledge bases themselves, we devise PARIS+, a probabilistic model capable of performing entity alignment for Web-scale knowledge stores on commodity hardware.

Finally, in Part IV, we present methods for mitigating structural gaps, which explicitly impact the link structure of knowledge sources and therefore provide direct enhancements to knowledge navigation. We first describe a framework to assess the causal impact of structural gaps and present methods for mitigating them. Next, to support human editors in effectively integrating new entities into linked textual corpora on the Web, we devise LOCEI, a framework to perform localized entity insertions.

We conclude by discussing the implications of our findings and presenting future research opportunities enabled by our contributions.

Type
doctoral thesis
DOI
10.5075/epfl-thesis-9910
Author(s)
Arora, Akhil  

EPFL

Advisors
West, Robert  
Jury

Prof. Caglar Gulcehre (president); Prof. Robert West (thesis director); Prof. Antoine Bosselut, Dr Leila Zia, Dr Ryen White (examiners)

Date Issued

2024

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2024-07-19

Thesis number

9910

Total pages

219

Subjects

knowledge navigation • knowledge gaps • information extraction • link recommendation • causal inference • logs analysis • knowledge graphs • Wikipedia • Web • online news

EPFL units
DLAB  
Faculty
IC  
School
IINFCOM  
Doctoral School
EDIC  
Available on Infoscience
July 30, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/240510

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved