A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence interpretation

A general syntax for expressing biomolecular sequence motifs is described, which will be used in future releases of the PROSITE data bank and in a similar collection of nucleic acid sequence motifs currently under development. The central part of the syntax is a regular structure which can be viewed as a generalization of the profiles introduced by Gribskov and coworkers. Accessory features implement specific motif search strategies and provide information helpful for the interpretation of predicted matches. Two contrasting examples, representing E. coli promoters and SH3 domains respectively, are shown to demonstrate the versatility of the syntax and its compatibility with diverse motif search methods. It is argued that a comprehensive machine-readable motif collection based on the new syntax, in conjunction with a standard search program, can serve as a general-purpose sequence interpretation and function prediction tool.


Published in:
Proc Int Conf Intell Syst Mol Biol, 2, 53-61
Year:
1994
Note:
Swiss Institute for Experimental Cancer Research, Epalinges s/Lausanne, Switzerland.
Record created 2007-12-17, last modified 2018-12-03

