research article

Probabilistic base calling of Solexa sequencing data

Rougemont, Jacques
•
Amzallag, Arnaud
•
Iseli, Christian
2008
BMC bioinformatics

BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data pose new challenges; currently a fair proportion of the tags are routinely discarded because they cannot be matched to a reference sequence, which reduces the effective throughput of the technology.

RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads.

CONCLUSION: We show that the method increases genome coverage and the number of usable tags by an average of 15% compared with Solexa's data processing pipeline. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
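The two ideas named in the abstract, coding ambiguous bases with IUPAC symbols and trimming reads by information content, can be illustrated with a minimal Python sketch. This is not the authors' algorithm or the API of their R package: it assumes per-position base probabilities are already available (the paper derives them from fluorescence intensities by model-based clustering, which is not reproduced here), and the function names `call_base`, `information`, and `best_subtag`, along with the `cutoff`, `threshold`, and `min_len` parameters and their default values, are hypothetical choices made for illustration.

```python
import math

# IUPAC nucleotide ambiguity codes, keyed by the set of bases each one covers.
IUPAC = {frozenset(k): v for k, v in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}.items()}


def call_base(probs, cutoff=0.9):
    """Return the IUPAC symbol for the smallest set of most probable bases
    whose cumulative probability reaches `cutoff` (an assumed rule)."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    chosen, total = [], 0.0
    for base in ranked:
        chosen.append(base)
        total += probs[base]
        if total >= cutoff:
            break
    return IUPAC[frozenset(chosen)]


def information(probs):
    """Information content in bits: 2 minus the Shannon entropy of the call."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return 2.0 - entropy


def best_subtag(read_probs, threshold=1.0, min_len=1):
    """Return the half-open span (start, end) of the contiguous sub-tag that
    maximises the summed (information - threshold) score (an assumed score)."""
    scores = [information(p) - threshold for p in read_probs]
    best, best_span = -math.inf, (0, 0)
    for start in range(len(scores)):
        running = 0.0
        for end in range(start + 1, len(scores) + 1):
            running += scores[end - 1]
            if end - start >= min_len and running > best:
                best, best_span = running, (start, end)
    return best_span


# Hypothetical read: five confident positions followed by three uncertain ones.
confident = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
uniform = dict.fromkeys("ACGT", 0.25)
read = [confident] * 5 + [uniform] * 3

symbols = "".join(call_base(p) for p in read)
start, end = best_subtag(read)
print(symbols, symbols[start:end])
```

In this toy read the uncertain tail is coded as `N` and then trimmed away, which mirrors the abstract's point that ambiguous positions can be kept as IUPAC symbols inside a read while low-information ends are removed. The quadratic search over spans is acceptable here only because tags are short (up to 36 bases).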

Type
research article
DOI
10.1186/1471-2105-9-431
Web of Science ID

WOS:000260490200001

Author(s)
Rougemont, Jacques
Amzallag, Arnaud
Iseli, Christian
Farinelli, Laurent
Xenarios, Ioannis
Naef, Felix  
Date Issued

2008

Publisher

BioMed Central

Published in
BMC bioinformatics
Volume

9

Start page

431

Subjects

Software

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
UPNAE  
Available on Infoscience
November 1, 2010
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/56539