Abstract

A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. This statement includes a new definition of disjointness of alignments.
