It is important to know in advance what the likely accuracy of an alignment will be. A common method for assessing the significance of a global alignment score is to compare it with the distribution of scores obtained by aligning random sequences of the same length and composition. The result (the S.D. score) is normally expressed in standard deviation units above the mean of that distribution.
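The randomisation procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used to obtain the results discussed here: the names `global_score` and `sd_score` are hypothetical, the scoring uses simple identity matching with a linear gap penalty rather than a substitution matrix, and the number of randomisations (100) is an arbitrary choice. Shuffling each sequence preserves its length and composition.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, no traceback).

    Assumption: identity scoring and a linear gap penalty, chosen for brevity.
    """
    # prev[j] holds the best score for aligning the prefix of a seen so far
    # with the first j residues of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


def sd_score(a, b, n_random=100, seed=0):
    """Score of the real alignment in S.D. units above the random mean."""
    rng = random.Random(seed)
    observed = global_score(a, b)
    shuffled_a, shuffled_b = list(a), list(b)
    random_scores = []
    for _ in range(n_random):
        # Shuffling keeps the length and composition of each sequence.
        rng.shuffle(shuffled_a)
        rng.shuffle(shuffled_b)
        random_scores.append(global_score(shuffled_a, shuffled_b))
    mean = statistics.mean(random_scores)
    sd = statistics.stdev(random_scores)
    if sd == 0:
        raise ValueError("random scores have zero variance")
    return (observed - mean) / sd
```

Calling `sd_score(seq_a, seq_b)` on two sequences given as strings returns the S.D. score under these simplified scoring assumptions. Absolute values from such a toy scoring scheme are not directly comparable with the thresholds quoted in the text.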
Comparing the S.D. score of an alignment with the alignment accuracy
obtained from comparison of the core secondary structures suggests that,
for proteins of 100-200 amino acids in length, a score above 15.0
S.D. indicates a near-ideal alignment, and scores above 5.0 S.D. a
"good" alignment where %of the residues in core secondary
structures will be correctly equivalenced, while alignments with
scores below 5.0 S.D. should be treated with caution [38][37].
Figure 2 shows the distribution of S.D. scores for 100,000 optimal
alignments of length between proteins of unrelated three-dimensional
structure. From Figure 2, the mean S.D. score expected
for the comparison of unrelated protein sequences is 3.2 S.D. with a
standard deviation of 0.9. However, the distribution is skewed with a tail of high
S.D. scores. In any large collection of alignments it is possible to
have a rare, high-scoring alignment between segments that actually share no structural
similarity. For example, Figure 3 illustrates an optimal local
alignment between regions of citrate synthase (2cts) and transthyretin (2paba)
that gives 7.55 S.D., even though the secondary structures of these two protein
segments are completely different.