Debunking Misinformation on the Web: Detection, Validation, and Visualisation

Our modern society is struggling with an unprecedented amount of online misinformation, which harms democracy, the economy, and cybersecurity. Journalism and politics have been affected by misinformation on a global scale: public trust in governments weakened during the Brexit referendum, and viral fake election stories outperformed genuine news on social media during the 2016 U.S. presidential election campaign. Online misinformation can even move financial markets, as when a single fake tweet about explosions at the White House wiped $136.5 billion off stock market value. Such attacks are increasingly driven by advances in modern artificial intelligence (AI) and pose a new, ever-evolving cyber threat operating at the information level, far more sophisticated than traditional cybersecurity attacks at the hardware and software levels. Research in this area is still in its infancy but already shows that debunking misinformation on the Web is a formidable challenge, for several reasons. First, the open nature of social platforms such as Facebook and Twitter lets users freely produce and propagate any content without authentication, and this has been exploited to spread hundreds of thousands of fake news stories at a rate of more than three million social posts per minute. Second, those responsible for spreading misinformation harness the power of AI attack models to mix and disguise falsehoods among genuine news. They camouflage their digital footprints by synthesizing millions of fake accounts that appear to participate in normal social interactions with other users. Third, innocent users, without proper alerts from algorithmic models, can accidentally spread misinformation in an exponential wave of shares, posts, and articles. Such a wave is often detected only once it is already beyond control, and can consequently cause large-scale effects in a very short time.
The overarching goal of this thesis is to help media organizations, governments, the public, and academia build a misinformation debunking framework in which algorithmic models and human validators are seamlessly and cost-effectively integrated to prevent the damage of misinformation before it occurs. This thesis investigates three important components of such a framework. 1) Detection: Early detection can potentially stop the spread of misinformation by flagging suspicious news for human attention; to date, however, it remains an unsolved challenge. 2) Validation: Training a good detection model requires a large amount of labelled data, yet such a model can quickly become outdated as new social trends emerge. A promising approach is to have human experts validate the detection results, helping algorithmic models retrain themselves and adapt to new traits of misinformation. 3) Visualisation: Disseminating debunking reports is an important step in raising public awareness of false content and educating Web users. However, human users are easily overwhelmed by the high volume of Web data, as the level of redundancy increases and the value density decreases. In summary, this thesis proposes key components for building a misinformation debunking framework. The proposed techniques improve upon the state of the art in a variety of misinformation domains, including rumours, Web claims, and social streams.


Advisor(s):
Aberer, Karl
Year:
2019
Publisher:
Lausanne, EPFL
Laboratories:
LSIR




 Record created 2019-07-29, last modified 2019-07-29

