Modeling sequencing errors by combining Hidden Markov models

Lottaz, C.; Iseli, C.; Jongeneel, C. V.; Bucher, P.

doi:10.1093/bioinformatics/btg1067

research article

One of the largest resources for biological sequence data is the large collection of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts, but for technical reasons they often contain sequencing errors. Such errors must therefore be taken into account when EST sequences are analyzed computationally. Earlier attempts to model error-prone coding regions have shown good performance in detecting and predicting these regions while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with error correction based on codon usage bias into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.

Type

research article

DOI

10.1093/bioinformatics/btg1067

Authors

Lottaz, C.; Iseli, C.; Jongeneel, C. V.; Bucher, P.

Publication date

2003

Published in

Bioinformatics

Volume

19 Suppl 2

Start page

ii103

End page

ii112

Note

Swiss Institute of Bioinformatics, Switzerland. Claudio.Lottaz@molgen.mpg.de

Peer reviewed

REVIEWED

EPFL units

GR-BUCHER

Available on Infoscience

December 17, 2007

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/15729