Machine learning based detection of digital documents maliciously recaptured from displays

Gholam-Zadeh, Saleh; Upenik, Evgeniy; Hatarsi, Guy; Ebrahimi, Touradj

doi:10.1117/12.2569256

conference paper

August 21, 2020

Proc. SPIE 11510, Applications of Digital Image Processing XLIII

SPIE Applications of Digital Image Processing XLIII

We used to say “seeing is believing”; this is no longer true. Digitization is changing all aspects of life and business. One of the more noticeable impacts is on how business documents are authored, exchanged, and processed. Many documents, such as passports and IDs, are first created in paper form but are immediately scanned, digitized, and further processed in electronic form. Widely available photo editing software makes image manipulation child's play, greatly increasing the amount of forged content. With growing concerns over the authenticity and integrity of scanned and image-based documents such as passports and IDs, it is urgent to be able to quickly validate scanned and photographic documents. The same machine learning that is behind some of the most successful content manipulation solutions can also be used as a countermeasure to detect them. In this paper, we describe an efficient machine-learning-based detector of recaptured digital documents. The core of the system is a binary classifier based on a support vector machine (SVM), trained on authentic and recaptured digital passports. The detector flags any digital document that is the result of photographic capture of another digital document displayed on an LCD monitor. To assess the proposed detector, a dedicated dataset of authentic and recaptured passports was created using a number of different cameras. Several experiments were set up to assess the overall performance of the detector as well as its efficacy in special situations, such as when the machine learning engine is trained on a specific type of camera or when it encounters a new type of camera for which it was not trained. Results show that the detector's accuracy remains above 90 percent in the large majority of cases.
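The cross-camera experiment mentioned in the abstract (testing on a camera type the classifier was not trained on) can be illustrated with a leave-one-camera-out split. The following is only a minimal sketch under stated assumptions: the `Sample` record layout, the function names, and the label encoding (0 = authentic, 1 = recaptured) are hypothetical, and this record does not describe the paper's actual features, SVM settings, or evaluation protocol.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical record layout: (image_id, camera_name, label),
# with label 0 = authentic and 1 = recaptured (an assumed encoding).
Sample = Tuple[str, str, int]


def leave_one_camera_out(
    samples: Sequence[Sample],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out_camera, train, test) for each camera in turn.

    The test fold contains only images from the held-out camera, so the
    classifier is scored on a camera type it never saw during training.
    """
    by_camera: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_camera[sample[1]].append(sample)

    for held_out in sorted(by_camera):
        train = [s for s in samples if s[1] != held_out]
        test = list(by_camera[held_out])
        # A binary SVM (for example scikit-learn's sklearn.svm.SVC) would be
        # fitted on features extracted from `train` and scored on `test` here.
        yield held_out, train, test


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Splitting by camera rather than by image keeps every image from the held-out device out of training, which is what separates the "new type of camera" case from the overall-performance case.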

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/173394

Name: Machine_Learning_Based_Detection_of_Digital_Documents_Maliciously_Recaptured_from_Displays.pdf

Type: Postprint

Access type: openaccess

License Condition: Copyright

Size: 8.91 MB

Format: Adobe PDF

Checksum (MD5): ce342a4737c62e2dcd6c20509bdda27d
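A downloaded copy of the PDF can be checked against the MD5 checksum listed in this record. The sketch below uses only Python's standard `hashlib` module; the function names and the local file path passed to them are assumptions for illustration, not part of the repository.

```python
import hashlib

# MD5 checksum as listed in this record.
EXPECTED_MD5 = "ce342a4737c62e2dcd6c20509bdda27d"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a safeguard against deliberate tampering.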