Evaluation of alignment accuracy

What is a good alignment? The amino acid sequence encodes the three-dimensional structure of the protein. Accordingly, when an alignment of two or more sequences is made, the implication is that the equivalenced residues are performing similar structural roles in the native folded protein. The best judge of alignment accuracy is thus obtained by comparing alignments resulting from sequence comparison with those derived from protein three-dimensional structures. There are now many families of proteins for which the structures of two or more members have been determined to atomic resolution by X-ray crystallography or NMR. Accurate alignment of these proteins by consideration of their tertiary structures [36][35][34] provides a set of test alignments against which to compare sequence-only alignment methods. Care must be taken when performing the comparison, since within protein families some regions show greater similarity than others. For example, the core β-strands and α-helices are normally well conserved, but surface loops vary in structure, so alignments in these regions may be ambiguous; if the three-dimensional structures are very different in a region, alignment there may be meaningless. Accordingly, evaluation of alignment accuracy is best concentrated on the core secondary structures of the protein and other conserved features [37]; such regions may be identified automatically by the algorithm of Russell and Barton [36].
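The comparison described above can be illustrated with a minimal sketch. It is not the procedure of any of the cited methods; it simply assumes that both the sequence-only alignment and the structure-derived reference are available as pairs of gapped strings, and that the conserved core is supplied as a set of residue indices in the first sequence. The function names (`aligned_pairs`, `alignment_accuracy`) and the toy sequences are hypothetical and chosen only for illustration.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue-index pairs equivalenced by a
    pairwise alignment given as two gapped strings of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def alignment_accuracy(test, reference, core_a=None):
    """Fraction of reference residue pairs reproduced by the test alignment.

    If core_a is given (residue indices of the first sequence that lie in
    conserved core regions), only reference pairs involving those residues
    are scored.
    """
    test_pairs = aligned_pairs(*test)
    ref_pairs = aligned_pairs(*reference)
    if core_a is not None:
        ref_pairs = {(i, j) for (i, j) in ref_pairs if i in core_a}
    if not ref_pairs:
        raise ValueError("no reference pairs to score")
    return len(ref_pairs & test_pairs) / len(ref_pairs)


# Toy data (assumed, not taken from any real protein family).
reference = ("ACDEFGH", "ACD-FGH")  # stands in for a structure-based alignment
test = ("ACDEFGH", "AC-DFGH")       # stands in for a sequence-only alignment
core = {4, 5, 6}                    # assumed core residues of the first sequence

overall = alignment_accuracy(test, reference)
core_only = alignment_accuracy(test, reference, core_a=core)
```

In this toy case the two alignments disagree only around the gap, so the overall score is lower than the score restricted to the assumed core, which is the situation the text warns about when variable regions are included in the evaluation.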

geoff.barton@ox.ac.uk