This section describes how to use AMPS to perform flexible pattern matching by
the method of Barton and Sternberg (1990). This approach can help identify
weaker similarities between proteins that would otherwise be missed by
conventional sequence comparison methods. AMPS may also be used to scan the
database with a multiple sequence alignment using the Needleman-Wunsch
algorithm, but this is normally less effective than deriving a pattern and
scanning with that.
PLEASE NOTE: It is important that you are familiar with the AMPS alignment
features described above before attempting flexible pattern matching.
AMPS allows the following operations to be performed:
- Definition of a pattern representative of a particular protein fold,
including the explicit description of allowed flexibility in gap length between
defined regions.
- A variety of scoring systems for each element of the pattern, e.g. based on
frequency weights, Dayhoff's matrix, conservation or fully user-defined
weights.
- Scanning of the pattern against a database of protein sequences, with
subsequent rank ordering and display of the results of the scan.
- Detailed analysis of a single sequence for the presence of multiple
occurrences of the pattern. Calculation of the significance of the best
matching pattern by reference to randomized sequences.
These points will be illustrated by reference to examples.
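As a rough illustration of what a pattern with flexible gaps looks like in
practice, the following Python sketch scores a two-element pattern against a
sequence by dynamic programming and then compares the result with shuffled
versions of the same sequence. It is not the AMPS implementation: the names
ELEMENTS, GAPS, MISMATCH, best_match and z_score, the weights, and the use of
a z-score as the significance measure are all assumptions made for this
example.

```python
import random
import statistics

# Hypothetical pattern: each element is a list of per-position weight tables
# (residue -> score). GAPS[i] is the (min, max) gap allowed after element i.
ELEMENTS = [
    [{"C": 5}, {"A": 2, "G": 2}, {"C": 5}],  # roughly C-[AG]-C
    [{"H": 4}, {"H": 4}],                    # roughly H-H
]
GAPS = [(2, 4)]
MISMATCH = -1  # assumed score for a residue not listed at a position


def element_score(element, seq, start):
    """Score one ungapped element placed at seq[start:]."""
    return sum(pos.get(seq[start + k], MISMATCH) for k, pos in enumerate(element))


def best_match(elements, gaps, seq):
    """Best total pattern score anywhere in seq, or None if it cannot fit."""
    n = len(seq)
    neg = float("-inf")
    # prev[e] = best score so far with the last placed element ending at e
    prev = [neg] * (n + 1)
    first = elements[0]
    for s in range(n - len(first) + 1):
        prev[s + len(first)] = element_score(first, seq, s)
    for elem, (gmin, gmax) in zip(elements[1:], gaps):
        cur = [neg] * (n + 1)
        for s in range(n - len(elem) + 1):
            # the previous element must end gmin..gmax residues before s
            lo, hi = max(0, s - gmax), s - gmin
            if hi < lo:
                continue
            base = max(prev[lo:hi + 1])
            if base > neg:
                cur[s + len(elem)] = base + element_score(elem, seq, s)
        prev = cur
    best = max(prev)
    return None if best == neg else best


def z_score(elements, gaps, seq, n_shuffles=100, seed=0):
    """Compare the real score with scores of shuffled copies of seq."""
    rng = random.Random(seed)
    real = best_match(elements, gaps, seq)
    residues = list(seq)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        scores.append(best_match(elements, gaps, "".join(residues)))
    sd = statistics.stdev(scores)
    return 0.0 if sd == 0 else (real - statistics.mean(scores)) / sd


if __name__ == "__main__":
    print(best_match(ELEMENTS, GAPS, "MKCACTTTHHLL"))
    print(z_score(ELEMENTS, GAPS, "MKCACTTTHHLL"))
```

Shuffling preserves the composition and length of the sequence, so a high
score relative to the shuffled copies suggests that the order of residues,
not just their frequencies, matches the pattern.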
A typical pattern analysis might consist of the following steps:
1. Define a pattern and choose a scoring scheme.
2. Scan the pattern against the database (PROGRAM MULTALIGN).
3. Sort the results (PROGRAM SORTER).
4. Get IDs of interesting proteins (PROGRAM SORTER).
5. Extract sequences of interesting proteins from the database (PROGRAM
SELECT).
6. Align pattern to the proteins, including alternative alignments
(PROGRAM MULTALIGN).
7. Produce compressed output for inspection (PROGRAM PATT).
Not every stage need be performed. For example, if we already know the subset
of the protein database that is interesting, then steps 1-4 can be skipped.
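The data flow through the scan, sort and select stages can be pictured with
a small in-memory Python sketch. This is only an analogy: in AMPS these
stages are separate programs working on files, and nothing below reflects
their actual input or output formats. The function names, the DATABASE
contents and the crude motif_count score are all hypothetical stand-ins.

```python
def scan(database, score_fn):
    """Scan stage: score every database entry, giving (id, score) pairs."""
    return [(seq_id, score_fn(seq)) for seq_id, seq in database.items()]


def sort_hits(hits):
    """Sort stage: rank hits from best to worst score."""
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def interesting_ids(ranked, cutoff):
    """ID stage: keep the IDs whose score reaches the cutoff."""
    return [seq_id for seq_id, score in ranked if score >= cutoff]


def extract(database, ids):
    """Extraction stage: pull the selected sequences out of the database."""
    return {seq_id: database[seq_id] for seq_id in ids}


# Hypothetical toy database and a deliberately crude stand-in score.
DATABASE = {"P1": "MKCACTTTHHLL", "P2": "MKLLVVAAGG", "P3": "CACCACHH"}


def motif_count(seq):
    """Count non-overlapping occurrences of the motif CAC."""
    return seq.count("CAC")


if __name__ == "__main__":
    ranked = sort_hits(scan(DATABASE, motif_count))
    ids = interesting_ids(ranked, cutoff=1)
    print(extract(DATABASE, ids))
```

If the interesting subset is already known, only the extraction stage and
the later alignment stages are needed, which is the shortcut described
above.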