Toward a Dynamic Threshold for Quality-Score Distortion in Reference-Based Alignment

Hernandez-Lopez, Ana A.; Alberti, C.; Mattavelli, M.

doi:10.1089/cmb.2019.0333

conference paper

2019

Journal of Computational Biology

15th International Symposium on Bioinformatics Research and Applications (ISBRA)

Quality scores, which are intrinsically high-entropy metadata, are largely the cause of the substantial size of sequence data files. Yet there is no consensus on a viable reduction of the resolution of the quality-score scale, arguably because of collateral side effects. In this paper we leverage the penalty functions of the HISAT2 aligner to rebin the quality-score scale so that sequence alignment is unaffected, and in doing so identify a distortion threshold. We tested our findings on whole-genome sequence and RNA sequence data, and contrasted the results with three methods for lossy distortion of the quality scores.
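To make the rebinning idea concrete, here is a minimal sketch that is not taken from the paper. It assumes the mismatch penalty documented in the HISAT2 manual for the `--mp MX,MN` option, MN + floor((MX - MN) * min(Q, 40) / 40), with the documented defaults MX = 6 and MN = 2. The function names `mismatch_penalty` and `penalty_bins` are hypothetical, and the paper's actual procedure may use other penalty functions or different settings. Quality scores that yield the same penalty are indistinguishable to this one function, so they can share a bin.

```python
from itertools import groupby


def mismatch_penalty(q, mx=6, mn=2):
    """Assumed HISAT2 mismatch penalty: MN + floor((MX - MN) * min(Q, 40) / 40).

    Integer arithmetic is used so the floor is exact.
    """
    return mn + ((mx - mn) * min(q, 40)) // 40


def penalty_bins(max_q=41, mx=6, mn=2):
    """Group Phred scores 0..max_q into runs that share one mismatch penalty.

    Returns a list of (lowest_q, highest_q, penalty) tuples. The penalty is
    non-decreasing in Q, so grouping consecutive scores is sufficient.
    """
    bins = []
    scores = range(max_q + 1)
    for penalty, run in groupby(scores, key=lambda q: mismatch_penalty(q, mx, mn)):
        run = list(run)
        bins.append((run[0], run[-1], penalty))
    return bins


if __name__ == "__main__":
    for low, high, penalty in penalty_bins():
        print(f"Q {low}-{high}: penalty {penalty}")
```

Under these assumed defaults the scale collapses to five bins (Q 0-9, 10-19, 20-29, 30-39, and 40 and above). This illustrates only why a penalty-aware rebinning can leave the mismatch score unchanged; it does not reproduce the distortion threshold reported by the authors.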

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/156831

Name: QS-distortion-threshold_ISBRA2019.pdf

Access type: openaccess

Size: 1.58 MB

Format: Adobe PDF

Checksum (MD5): 2f73c9219af1c8c0334179082c2af0c3