A Novel Replica Detection System using Binary Classifiers, R-trees, and PCA
Replica detection is a prerequisite for discovering copyright infringement and detecting illicit content. For this purpose, content-based systems can be an efficient alternative to watermarking. Rather than imperceptibly embedding a signal, content-based systems rely on image similarity. Certain content-based systems use adaptive classifiers to detect replicas. In such systems, a suspect image is tested against every original, which can become computationally prohibitive as the number of original images grows. In this paper, we propose using R-tree indexing to reduce the number of comparisons and rapidly select the most likely originals. Experimental results show that the proposed system performs very satisfactorily and that up to 99.7% of the originals can be discarded before the binary classifiers are applied.
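The two-stage idea described above (index-based preselection followed by per-original binary classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the feature vectors are assumed to be already reduced (for example by PCA), each original is assumed to be represented by a bounding box around the features of its known replicas, and a plain linear scan over the boxes stands in for the R-tree point query. The names `Box`, `bounding_box`, `candidate_originals`, and `detect` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]


@dataclass
class Box:
    """Axis-aligned hyper-rectangle (assumed: one per original image)."""
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]

    def contains(self, p: Point) -> bool:
        # True if p lies inside the box along every dimension.
        return all(l <= x <= h for l, x, h in zip(self.lo, p, self.hi))


def bounding_box(points: List[Point]) -> Box:
    """Smallest box enclosing the given (already reduced) feature vectors."""
    dims = list(zip(*points))
    return Box(tuple(min(d) for d in dims), tuple(max(d) for d in dims))


def candidate_originals(boxes: Dict[str, Box], suspect: Point) -> List[str]:
    # Linear scan standing in for an R-tree point query: an R-tree would
    # return the same set without visiting every box.
    return [name for name, box in boxes.items() if box.contains(suspect)]


def detect(
    boxes: Dict[str, Box],
    classifiers: Dict[str, Callable[[Point], bool]],
    suspect: Point,
) -> List[str]:
    # Only the preselected originals are passed to their binary classifiers.
    return [
        name
        for name in candidate_originals(boxes, suspect)
        if classifiers[name](suspect)
    ]


# Toy data (hypothetical): three originals in a 2-D reduced feature space.
boxes = {
    "A": bounding_box([(0.0, 0.0), (1.0, 1.0)]),
    "B": bounding_box([(5.0, 5.0), (6.0, 7.0)]),
    "C": bounding_box([(0.5, 0.5), (2.0, 2.0)]),
}
# Placeholder decision rules standing in for trained binary classifiers.
classifiers = {
    "A": lambda p: sum(p) < 1.8,
    "B": lambda p: True,
    "C": lambda p: p[0] > 1.0,
}

suspect = (0.8, 0.9)
candidates = candidate_originals(boxes, suspect)  # ["A", "C"]; "B" is discarded
matches = detect(boxes, classifiers, suspect)     # ["A"]
```

In this toy run, original "B" is discarded by the preselection step alone, so its classifier is never evaluated; that is the kind of saving the indexing stage is meant to provide.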