Speaker Inconsistency Detection in Tampered Video

With the growing amount of video consumed daily, there is a rising danger of maliciously modified video content (i.e., 'fake news') that could be used to harm innocent people or push a particular agenda, e.g., to meddle in elections. In this paper, we consider audio manipulations in video of a person speaking to the camera. Such manipulation is easy to perform — for instance, by simply replacing a part of the audio — yet it can dramatically change the message and meaning of the video. With the goal of developing an automated system that detects these audio-visual speaker inconsistencies, we consider several approaches proposed for lip-syncing and dubbing detection, based on convolutional and recurrent networks, and compare them with systems based on more traditional classifiers. We evaluate these methods on the publicly available databases VidTIMIT, AMI, and GRID, for which we generated sets of tampered data.
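The paper's systems are based on convolutional/recurrent networks or traditional classifiers; as a minimal illustration of the underlying idea only, the sketch below scores audio-visual consistency as the correlation between two hypothetical per-frame feature streams (audio energy and lip openness, here simulated with synthetic signals) and flags a clip as tampered when the score is low. All names and thresholds are assumptions for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FRAMES = 200
t = np.linspace(0.0, 20.0, N_FRAMES)

def make_clip(tampered):
    """Simulate per-frame features for a toy clip (hypothetical stand-ins
    for real audio energy and lip-openness measurements)."""
    speech = np.sin(3.1 * t)  # shared articulation signal drives both modalities
    lips = speech + 0.3 * rng.standard_normal(N_FRAMES)
    if tampered:
        # A replaced audio track no longer follows the lip movement.
        audio = np.sin(7.3 * t) + 0.3 * rng.standard_normal(N_FRAMES)
    else:
        audio = speech + 0.3 * rng.standard_normal(N_FRAMES)
    return audio, lips

def consistency_score(audio, lips):
    # Pearson correlation between the two per-frame feature streams.
    return float(np.corrcoef(audio, lips)[0, 1])

def is_tampered(audio, lips, threshold=0.5):
    # Low audio-visual correlation suggests the audio was manipulated.
    return consistency_score(audio, lips) < threshold
```

In practice, the learned systems compared in the paper replace this hand-crafted correlation score with features and decision functions trained on tampered/genuine examples.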

Presented at:
European Signal Processing Conference

Record created 2018-07-26, last modified 2019-12-05
