Authors: Upenik, Evgeniy; Testolina, Michela; Ascenso, Joao; Pereira, Fernando; Ebrahimi, Touradj
Date: 2021-10-22
Year: 2021
DOI: 10.1109/VCIP53242.2021.9675314
URL: https://infoscience.epfl.ch/handle/20.500.14299/182389

Abstract: Learning-based image codecs produce compression artifacts that differ from the blocking and blurring degradations introduced by conventional image codecs such as JPEG, JPEG 2000 and HEIC. In this paper, a crowdsourcing-based subjective quality evaluation procedure was used to benchmark a representative set of end-to-end deep learning-based image codecs submitted to the MMSP'2020 Grand Challenge on Learning-Based Image Coding and the JPEG AI Call for Evidence. For the first time, a double stimulus methodology with a continuous quality scale was applied to evaluate this type of image codec. The subjective experiment is one of the largest ever reported, including more than 240 pair-comparisons evaluated by 118 naïve subjects. The results of benchmarking learning-based image coding solutions against conventional codecs are organized in a dataset of differential mean opinion scores, along with the stimuli, and made publicly available.

Keywords: deep learning; image coding; learning-based compression; subjective evaluation; visual quality; crowdsourcing
Title: Large-Scale Crowdsourcing Subjective Quality Evaluation of Learning-Based Image Coding
Type: text::conference output::conference proceedings::conference paper