Toward a Dynamic Threshold for Quality-Score Distortion in Reference-Based Alignment

Hernandez-Lopez, Ana A.; Alberti, C.; Mattavelli, M.

doi:10.1089/cmb.2019.0333

2019

Abstract

The intrinsically high-entropy metadata known as quality scores are largely responsible for the substantial size of sequence data files. Yet there is no consensus on a viable reduction of the resolution of the quality-score scale, arguably because of collateral side effects. In this paper, we leverage the penalty functions of the HISAT2 aligner to rebin the quality-score scale so as to avoid any impact on sequence alignment, and identify a distortion threshold in the process. We tested our findings on whole-genome sequence and RNA sequence data, and contrasted the results with three methods for lossy distortion of the quality scores.
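To make the rebinning idea concrete, here is a minimal sketch and not the authors' procedure. It assumes only the mismatch-penalty formula documented for HISAT2's `--mp MX,MN` option, `MN + floor((MX - MN) * min(Q, 40) / 40)`, with the documented defaults MX = 6 and MN = 2. The function names, the `max_q` cap, and the choice of the lowest score as each bin's representative are hypothetical, and other penalties (for example soft-clipping or ambiguous bases) are ignored.

```python
def mismatch_penalty(q, mx=6, mn=2):
    """Mismatch penalty for Phred score q, per the documented --mp formula.

    Integer arithmetic is used so the floor is exact.
    """
    return mn + ((mx - mn) * min(q, 40)) // 40


def build_bins(max_q=41, mx=6, mn=2):
    """Group Phred scores 0..max_q by the penalty they produce."""
    bins = {}
    for q in range(max_q + 1):
        bins.setdefault(mismatch_penalty(q, mx, mn), []).append(q)
    return bins


def rebin(quals, max_q=41, mx=6, mn=2):
    """Replace each score by the lowest score of its penalty bin.

    The representative (lowest member) is an arbitrary, hypothetical choice;
    any member of the same bin yields the same mismatch penalty.
    """
    bins = build_bins(max_q, mx, mn)
    representative = {q: members[0] for members in bins.values() for q in members}
    return [representative[min(q, max_q)] for q in quals]


if __name__ == "__main__":
    original = [2, 15, 37, 40, 41]
    reduced = rebin(original)
    print(reduced)
    # The penalty seen by the aligner is unchanged for every score.
    print([mismatch_penalty(q) for q in original] ==
          [mismatch_penalty(q) for q in reduced])
```

Under these assumptions the default penalties collapse the scale to five bins, so any distortion that stays inside a bin leaves the mismatch penalty, and hence this component of the alignment score, unchanged.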

Details

Title Toward a Dynamic Threshold for Quality-Score Distortion in Reference-Based Alignment

Author(s) Hernandez-Lopez, Ana A. ; Alberti, C. ; Mattavelli, M.

Published in Journal of Computational Biology

Volume 27

Issue 2

Pages 288-300

Presented at 15th International Symposium on Bioinformatics Research and Applications (ISBRA), Barcelona, Spain, June 3–6, 2019

Date 2019

Keywords (free)

Quality scores; Reference-based alignment; Quality score distortion; HISAT2; Lossy compression

DOI https://doi.org/10.1089/cmb.2019.0333

Laboratories SCI-STI-MM

This record appears in Production scientifique et compétences > STI - Faculté des sciences et techniques de l'ingénieur > IEM - Institute of Electrical and Micro Engineering > SCI-STI-MM - Groupe SCI STI MM
Peer-reviewed publications
Conference papers
Work produced at EPFL

Record creation date 2019-06-17
