Accessibilities for residues within each map are calculated quickly by
exploiting the relationship between relative accessibility and the
number of other atoms within
Å (
) of a
residue's
atom.
is calculated by considering
secondary structures and the C-terminal coils for the matched
structures. Analysis of the high-quality domains shows that helical
residues are buried (b) when
, exposed (e) when
and intermediate/unknown (u) otherwise. Similarly,
residues in
strands are b when
, e when
and u otherwise. In the examples presented here,
predicted accessibilities were taken from the SUB line within PHD
[Rost & Sander, 1994] output, which highlights those regions predicted with
confidence. Remaining positions were assigned as unknown (u) accessibility.
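The neighbour-count approach to accessibility described above can be illustrated with a minimal Python sketch. It is not the original implementation: the function names, the choice of one representative coordinate per residue, the inclusive inequalities, and in particular the radius and threshold values are all assumptions made for illustration, since the actual cut-offs are not reproduced here. The only property relied upon is that more neighbouring atoms implies a more buried residue.

```python
import math

def count_neighbours(coords, i, radius):
    """Count atoms other than atom i lying within `radius` (in Å) of atom i.

    `coords` is a list of (x, y, z) tuples, one representative atom per
    residue (an assumption of this sketch).
    """
    centre = coords[i]
    return sum(
        1
        for j, other in enumerate(coords)
        if j != i and math.dist(centre, other) <= radius
    )

def classify(count, buried_min, exposed_max):
    """Map a neighbour count to 'b' (buried), 'e' (exposed) or 'u' (unknown).

    `buried_min` and `exposed_max` are hypothetical thresholds; in practice a
    separate pair would be used for helices and for strands.
    """
    if count >= buried_min:
        return "b"
    if count <= exposed_max:
        return "e"
    return "u"

# Illustrative values only, not those derived from the high-quality domains.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 0.0, 0.0)]
n = count_neighbours(coords, 0, radius=3.0)
state = classify(n, buried_min=2, exposed_max=0)
```

The all-against-all distance loop is quadratic in the number of atoms; a spatial grid would be the usual choice for larger structures, but the simple form keeps the idea visible.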
Given assignments of accessibility, the best alignment for each pair
of secondary structures, not permitting gaps within either secondary
structure, is found by applying the scoring matrix shown in Table 3.
These values were chosen to prevent long overhanging gaps in
the alignment of predicted and experimental secondary structures,
and designed not to penalise mismatches too heavily.
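As a rough illustration of this ungapped alignment step, the following Python sketch slides one string of accessibility states (b, e, u) along another, sums pair scores over the overlapping positions, and charges a fixed score for each overhanging position. The names `best_ungapped_score` and `HYPOTHETICAL_SCORES`, and every numerical value in it, are assumptions for illustration only; they are not the values of Table 3.

```python
# Hypothetical pair scores: reward agreement, mildly penalise b/e mismatches,
# and treat any pairing involving 'u' as neutral. Not the Table 3 values.
HYPOTHETICAL_SCORES = {}
for a in "beu":
    for b in "beu":
        if "u" in (a, b):
            HYPOTHETICAL_SCORES[(a, b)] = 0
        elif a == b:
            HYPOTHETICAL_SCORES[(a, b)] = 2
        else:
            HYPOTHETICAL_SCORES[(a, b)] = -1

def best_ungapped_score(pred, obs, pair_score, overhang_score):
    """Return (score, offset) of the best gap-free alignment of two elements.

    `offset` is the position in `obs` aligned with pred[0]; it may be negative.
    Positions of either element left unpaired count as overhang.
    """
    best = None
    for offset in range(-(len(pred) - 1), len(obs)):
        score = 0
        overlap = 0
        for i, p in enumerate(pred):
            j = i + offset
            if 0 <= j < len(obs):
                score += pair_score[(p, obs[j])]
                overlap += 1
        overhang = (len(pred) - overlap) + (len(obs) - overlap)
        score += overhang_score * overhang
        if best is None or score > best[0]:
            best = (score, offset)
    return best

# Predicted states for one element against experimentally derived states.
result = best_ungapped_score("bbe", "ubbe", HYPOTHETICAL_SCORES, overhang_score=-1)
```

With these illustrative values the best placement aligns the predicted element one position into the observed one, leaving a single overhanging residue. Making the overhang score negative discourages placements in which most of one element hangs off the end of the other.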
The total similarity score for the alignment is then defined as:
where is the best score for a pair of matched secondary
structures calculated by summing values from Table 3,
is the
number of matched secondary structures, and
is the total
difference in the lengths of the two protein domains being compared.
When calculating
those secondary structures that have been
equivalenced are ignored, since overhanging gaps are already penalised
by the gap score in Table 3.