A sequence similarity search algorithm based on a probabilistic interpretation of an alignment scoring system
We present a probabilistic interpretation of local sequence alignment methods in which the alignment scoring system (ASS) plays the role of a stochastic process defining a probability distribution over all sequence pairs. An explicit algorithm is given to compute the probability of two sequences given an ASS. Based on this definition, a modified version of the Smith-Waterman local similarity search algorithm has been devised, which assesses sequence relationships by log-likelihood ratios. When tested on classical examples such as globins or G-protein-coupled receptors, the new method proved to be up to an order of magnitude more sensitive than the native Smith-Waterman algorithm.
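To illustrate the contrast between the two approaches, the following is a minimal sketch and not the algorithm of this paper. It assumes a toy match/mismatch scoring system with a linear gap penalty, and it treats each score as a log-odds value scaled by a hypothetical parameter `lam`. The function `smith_waterman` returns the score of the single best local alignment, while `summed_log_odds` replaces the maximization with a sum over all local alignment paths, which is one common way to obtain a log-likelihood-ratio-style statistic. All function names, parameter values, and the recursion itself are assumptions made for illustration.

```python
import math


def substitution(a, b, match=2.0, mismatch=-1.0):
    """Toy substitution score (assumed values, not from the paper)."""
    return match if a == b else mismatch


def smith_waterman(x, y, gap=2.0):
    """Score of the best local alignment, using a linear gap penalty."""
    best = 0.0
    prev = [0.0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0.0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            cur[j] = max(
                0.0,                                         # start afresh
                prev[j - 1] + substitution(x[i - 1], y[j - 1]),  # aligned pair
                prev[j] - gap,                               # gap in y
                cur[j - 1] - gap,                            # gap in x
            )
            best = max(best, cur[j])
        prev = cur
    return best


def summed_log_odds(x, y, gap=2.0, lam=1.0):
    """Log of (1 + summed weight of all non-empty local alignment paths).

    Each path has weight exp(lam * score). This is a simplification:
    equivalent alignments that differ only in the order of adjacent gaps
    are counted separately, and the sum is kept in linear space, so it
    is suitable only for short sequences.
    """
    gap_weight = math.exp(-lam * gap)
    total = 0.0
    prev = [0.0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0.0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            pair_weight = math.exp(lam * substitution(x[i - 1], y[j - 1]))
            # Paths ending at (i, j): start here with an aligned pair,
            # extend a path with an aligned pair, or extend with a gap.
            cur[j] = (pair_weight * (1.0 + prev[j - 1])
                      + gap_weight * (prev[j] + cur[j - 1]))
            total += cur[j]
        prev = cur
    return math.log(1.0 + total)
```

Because the sum includes the best-scoring path, `summed_log_odds(x, y)` is never smaller than `lam * smith_waterman(x, y)` under the same scoring parameters; the extra contribution comes from suboptimal alignments that the maximization discards.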