Data Summarization with Social Contexts

Zhuang, Hao; Rahman, Rameez; Hu, Xia; Guo, Tian; Hui, Pan; Aberer, Karl

doi:10.1145/2983323.2983736

2016

Abstract

While social data is widely used in applications such as sentiment analysis and trend prediction, its sheer size presents great challenges for storing, sharing and processing such data. These challenges can be addressed by data summarization, which transforms the original dataset into a smaller, yet still useful, subset. Existing methods find such subsets with objective functions based on data properties such as representativeness or informativeness, but they do not exploit social contexts, which are distinct characteristics of social data. Further, to date very little work has focused on topic-preserving data summarization, despite the abundant work on topic modeling. This is a challenging task for two reasons. First, since topic models are based on latent variables, existing summarization methods are not well suited to capturing latent topics. Second, it is difficult to find social contexts that provide valuable information for building an effective topic-preserving summarization model. To tackle these challenges, in this paper we focus on exploiting social contexts to summarize social data while preserving the topics in the original dataset. We take Twitter data as a case study. By analyzing Twitter data, we discover two social contexts that are important for topic generation and dissemination, namely (i) the CrowdExp topic score, which captures the influence of both the crowd and the expert users in Twitter, and (ii) the Retweet topic score, which captures the influence of Twitter users' actions. We conduct extensive experiments on two real-world Twitter datasets using two applications. The experimental results show that, by leveraging social contexts, our proposed solution can enhance topic-preserving data summarization and improve application performance by up to 18%.

Details

Title Data Summarization with Social Contexts

Author(s) Zhuang, Hao ; Rahman, Rameez ; Hu, Xia ; Guo, Tian ; Hui, Pan ; Aberer, Karl

Published in CIKM'16: Proceedings of the 2016 ACM Conference on Information and Knowledge Management

Pagination 10

Pages 397-406

Conference 25th ACM Conference on Information and Knowledge Management (CIKM), Indianapolis, IN, USA, October 24-28, 2016

Date 2016

Publisher New York, Association for Computing Machinery

ISBN 978-1-4503-4073-1

Keywords

Data Summarization; Social Context; Submodular Optimization; Topic Model

DOI https://doi.org/10.1145/2983323.2983736

Laboratories LSIR

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > LSIR - Distributed Information Systems Laboratory
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2016-08-17
