New Genome Similarity Measures based on Conserved Gene Adjacencies

Doerr, Daniel; Kowada, Luis Antonio B.; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M. E.; Stoye, Jens

doi:10.1089/cmb.2017.0065

2017

Abstract

Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases the computational cost is the same as in the general family-free case, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that differ in expressive power. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results that demonstrate the applicability and performance of our newly defined similarity measures.
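
To make the notion of a conserved adjacency concrete, the following minimal sketch counts shared gene adjacencies under the simplest model named in the abstract, where both genomes contain the same genes, each in exactly one copy. It is an illustration only and not one of the three gene connections measures defined in the article. The function names, the unsigned linear genome representation, and the toy gene orders are all assumptions made for this example.

```python
def adjacencies(genome):
    """Return the set of unordered pairs of neighbouring genes.

    Assumption: the genome is a linear, unsigned gene order in which
    every gene occurs exactly once.
    """
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def conserved_adjacencies(genome_a, genome_b):
    """Count the adjacencies present in both gene orders."""
    return len(adjacencies(genome_a) & adjacencies(genome_b))


# Hypothetical toy genomes over the same five single-copy genes.
genome_a = ["a", "b", "c", "d", "e"]
genome_b = ["a", "c", "b", "d", "e"]

# Shared adjacencies here are {b, c} and {d, e}.
print(conserved_adjacencies(genome_a, genome_b))  # prints 2
```

A higher count means more of the local gene order is shared. Family-based and family-free models have to decide first which genes correspond to each other, which is where the combinatorial difficulty discussed in the abstract arises.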

Details

Title New Genome Similarity Measures based on Conserved Gene Adjacencies

Author(s) Doerr, Daniel ; Kowada, Luis Antonio B. ; Araujo, Eloi ; Deshpande, Shachi ; Dantas, Simone ; Moret, Bernard M. E. ; Stoye, Jens

Published in Journal of Computational Biology

Pagination 19

Volume 24

Issue 6

Pages 616-634

Date 2017

Publisher New Rochelle, Mary Ann Liebert

ISSN 1066-5277

Keywords

family-free genome comparison; gene connections; genome rearrangements; genome similarity measure; conserved adjacencies

DOI https://doi.org/10.1089/cmb.2017.0065

Laboratories LCBB

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IC Archives > LCBB - Laboratory for Computational Biology and Bioinformatics
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2017-07-10
