Toward a Dynamic Threshold for Quality-Score Distortion in Reference-Based Alignment

The intrinsic high entropy metadata, known as quality scores, are largely the cause of the substantial size of sequence data files. Yet, there is no consensus on a viable reduction of the resolution of the quality score scale, arguably because of collateral side effects. In this paper we leverage on the penalty functions of HISAT2 aligner to rebin the quality score scale in such a way as to avoid any impact on sequence alignment, identifying alongside a distortion threshold. We tested our findings on whole-genome sequence and RNA sequence data, and contrasted the results with three methods for lossy distortion of the quality scores.


Presented at:
15th International Symposium on Bioinformatics Research and Applications (ISBRA), Barcelona, Spain, June 3–6, 2019
Year:
2019
Keywords:
Laboratories:




 Record created 2019-06-17, last modified 2019-08-12

Fulltext:
Download fulltext
PDF

Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)