Infoscience

conference paper

Efficient Document Filtering Using Vector Space Topic Expansion and Pattern-Mining: The Case of Event Detection in Microposts

Proskurnia, Julia • Mavlyutov, Ruslan • Castillo, Carlos • Aberer, Karl • Cudré-Mauroux, Philippe
2017
Proceedings of the 26th Conference on Information and Knowledge Management
26th Conference on Information and Knowledge Management (CIKM'17)

Automatically extracting information from social media is challenging because social content is often noisy, ambiguous, and inconsistent. However, since many stories break on social channels before mainstream media pick them up, developing methods that handle social content better is of utmost importance. In this paper, we propose a robust and effective approach to automatically identify microposts related to a specific topic defined by a small sample of reference documents. Our framework extracts clusters of semantically similar microposts that overlap with the reference documents by using frequent pattern mining to find the combinations of key features that define those clusters. This allows us to construct compact and interpretable representations of the topic, dramatically decreasing the computational burden compared to classical clustering and k-NN-based machine learning techniques and producing highly competitive results even with small training sets (fewer than 1,000 training objects). Our method is efficient and scales gracefully with large sets of incoming microposts. We experimentally validate our approach on a large corpus of over 60M microposts, showing that it significantly outperforms state-of-the-art techniques.
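
The idea of describing a topic by frequently co-occurring feature combinations can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the naive tokenizer, the brute-force pair counting, the function names (`tokenize`, `mine_frequent_patterns`, `matches_topic`), the support threshold, and the example texts are all assumptions made for illustration. The sketch also omits the vector space topic expansion and semantic clustering mentioned in the title and abstract.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return set(text.lower().split())


def mine_frequent_patterns(docs, min_support=2, size=2):
    """Return term combinations of the given size that occur in at least
    min_support documents. Brute-force counting, adequate only for small samples."""
    counts = Counter()
    for doc in docs:
        terms = sorted(tokenize(doc))
        for combo in combinations(terms, size):
            counts[combo] += 1
    return {frozenset(c) for c, n in counts.items() if n >= min_support}


def matches_topic(post, patterns):
    """Keep a post if it contains every term of at least one mined pattern."""
    terms = tokenize(post)
    return any(p <= terms for p in patterns)


# Hypothetical reference documents defining the topic (toy data).
reference_docs = [
    "earthquake hits city center",
    "strong earthquake shakes city",
    "city reports earthquake damage",
]
patterns = mine_frequent_patterns(reference_docs, min_support=3, size=2)

# Hypothetical incoming microposts (toy data).
stream = [
    "huge earthquake felt across the city tonight",
    "great pizza in the city",
    "earthquake drill at school",
]
kept = [post for post in stream if matches_topic(post, patterns)]
print(kept)
```

Because each incoming post is checked against a small set of patterns with subset tests, filtering needs no pairwise comparison against stored training documents, which is the kind of saving over k-NN-style approaches that the abstract alludes to.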

Type: conference paper
DOI: 10.1145/3132847.3133016
Author(s): Proskurnia, Julia • Mavlyutov, Ruslan • Castillo, Carlos • Aberer, Karl • Cudré-Mauroux, Philippe
Date Issued: 2017
Published in: Proceedings of the 26th Conference on Information and Knowledge Management
Start page: 457
End page: 466

Subjects: Event detection • microposts • frequent patterns mining • semantic attributes

Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LSIR
Event name: 26th Conference on Information and Knowledge Management (CIKM'17)
Event place: Singapore
Event date: November 6-10, 2017

Available on Infoscience: August 17, 2017
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/139693
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved