Each of the aligned pairs of structures was considered separately. Within an aligned
pair of structures (i.e., proteins 1 and 2) two pairs of residues are defined:
one pair in protein 1 and one pair in protein 2.
In the alignment, the first position of the protein 1 pair is aligned with the first
position of the protein 2 pair, and the second position of the protein 1 pair is
aligned with the second position of the protein 2 pair. Figure 3 illustrates these
definitions for a pair of simple 3D structures.
All pairs of interacting residues were required to have more than
four residues between them on the sequence.
All possible combinations of a residue pair in protein 1 and the aligned residue pair
in protein 2 were considered as to whether:
1. the positions are in contact in one or both structures,
2. if in contact, whether or not the interaction is favourable
(i.e., favourable in one and/or the other structure), or
3. if both structures are in contact at these positions, whether the interactions are similar.
Tests 1 and 2 provide information about the general nature of interactions within protein structures when considered individually. For example, the data may show how the number of favourable interactions behaves as a function of sequence length. Test 3 provides information as to how different sequences (often with little or no apparent sequence similarity) adopt similar three-dimensional structures, and thus requires further description.
When comparing positions found to be in contact at two aligned positions there
are three possibilities: 1) both interactions can be favourable; 2) one interaction can be favourable
(and the other unfavourable); and 3) both interactions can be unfavourable.
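
The three possibilities can be sketched as a small function. This is a minimal illustration only, not the procedure used to produce the results: the name `compare_contacts` and the convention that a favourable interaction has a negative pseudo-energy are assumptions made for the example.

```python
def compare_contacts(energy_1, energy_2):
    """Classify two aligned contacts by how many of them are favourable.

    energy_1, energy_2: pseudo-energies of the contact in protein 1 and in
    protein 2. Assumption: a negative pseudo-energy means "favourable".
    """
    # Booleans add as integers, so this counts the favourable contacts (0, 1 or 2).
    n_favourable = (energy_1 < 0) + (energy_2 < 0)
    labels = {
        2: "both favourable",
        1: "one favourable",
        0: "both unfavourable",
    }
    return labels[n_favourable]
```

Only the first outcome, where both contacts are favourable, is carried forward into the finer sub-division by intermediates.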
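Situations where both interactions are favourable
can be sub-divided by considering the intermediates involved in mutating one interaction to the other.
In other words, if one were to mutate, in two steps, the interaction in protein 1
to the interaction in protein 2, two mutations would be involved, and the two possible
evolutionary paths would differ in which residue of the pair is mutated first:
either the first residue is replaced and then the second, or the second is replaced
and then the first. Each path passes through a different intermediate pair, made up
of one residue from each protein.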
Of course, such evolutionary paths are hypothetical, since the mutation of one pair
of residues to another may involve more than one intermediate.
However, considering both hypothetical paths enables all shared favourable interactions to
be sub-divided by considering the stability of the two
intermediates. There are three possible
situations:
1. Highly similar interactions, defined as those positions with both intermediates
having a favourable pseudo-energy.
2. Partly similar interactions, defined as those positions having one favourable
intermediate and one unfavourable intermediate.
3. Complementary changes, defined as those positions with both intermediates having
an unfavourable pseudo-energy.
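
The sub-division above can be sketched in code as follows. Everything named here is hypothetical and chosen only for illustration: the function `classify_shared_interaction`, the lookup `toy_energy`, and the values in `TOY_ENERGIES` are invented and are not the pseudo-energies used in this work. The sketch also assumes that a negative pseudo-energy is favourable.

```python
# Invented toy pseudo-energies for unordered residue pairs (NOT real values).
TOY_ENERGIES = {
    ("E", "K"): -1.0, ("K", "K"): 1.0, ("E", "E"): 1.0,
    ("L", "L"): -1.0, ("L", "V"): -1.0, ("V", "V"): -1.0,
    ("K", "V"): -0.2, ("K", "L"): 0.3,
}


def toy_energy(res_x, res_y):
    """Symmetric lookup into the toy table (hypothetical helper)."""
    return TOY_ENERGIES[tuple(sorted((res_x, res_y)))]


def classify_shared_interaction(energy, pair_1, pair_2):
    """Classify two aligned, favourable interactions by their intermediates.

    pair_1: (first, second) residues of the contact in protein 1.
    pair_2: the aligned (first, second) residues in protein 2.
    Returns None unless both observed interactions are favourable.
    """
    a1, b1 = pair_1
    a2, b2 = pair_2
    if not (energy(a1, b1) < 0 and energy(a2, b2) < 0):
        return None
    # One intermediate per two-step path: mutate the first residue first,
    # or mutate the second residue first.
    intermediates = [(a2, b1), (a1, b2)]
    n_favourable = sum(energy(x, y) < 0 for x, y in intermediates)
    return ("complementary change", "partly similar", "highly similar")[n_favourable]
```

With these toy values, a Lys-Glu contact aligned with a Glu-Lys contact gives the intermediates Glu-Glu and Lys-Lys, both unfavourable, so `classify_shared_interaction(toy_energy, ("K", "E"), ("E", "K"))` returns `"complementary change"`.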
Type 1 describes interactions of a similar character in both structures, and thus suggests features common to the two structures (and perhaps to the fold in general). Type 2 describes less similar interactions, suggesting positions that are on the point of diverging from each other. Pairs of interactions of type 3 are the most interesting, since they are of significantly different character in the two structures, yet both contribute to the stability of their respective structures.
By considering the abundances of the amino acids,
it is possible to determine the expected frequency of the interactions described above.
All unique possible combinations of two
residue pairs involving the
amino acids (excluding glycine, which can make no side-chain to
side-chain contacts) were classified by the above definitions. The results are shown in
Table 4. The weighted frequencies provide the expected limits for the types of
mutated pairs of interacting residues. Given a hypothetical situation where two
proteins having a similar 3D structure were known to be related convergently (i.e.,
that their most recent common ancestor had a different 3D structure), then one
would expect the proportion of the total possible number of shared interactions
that are favourable to follow the weighted frequencies in Table 4;
this proportion can be further sub-divided into
highly similar,
partly similar
and
complementary changes.
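
The way such expected proportions arise from amino acid abundances can be sketched as below. This is an illustration under stated assumptions, not the calculation behind Table 4: the function name `expected_fractions` is hypothetical, each combination is weighted by the product of the abundances of its four residues, ordered rather than unique combinations are enumerated, and a negative pseudo-energy is taken to be favourable. The two-residue alphabet and its energies are invented toy inputs.

```python
from itertools import product


def expected_fractions(abundance, energy):
    """Abundance-weighted fraction of residue-pair combinations in each class.

    abundance: dict mapping residue -> relative abundance.
    energy: callable (res_x, res_y) -> pseudo-energy (negative = favourable).
    """
    totals = {
        "not both favourable": 0.0,
        "highly similar": 0.0,
        "partly similar": 0.0,
        "complementary change": 0.0,
    }
    for a1, b1, a2, b2 in product(list(abundance), repeat=4):
        # Assumed weighting: the four residues are drawn independently.
        weight = abundance[a1] * abundance[b1] * abundance[a2] * abundance[b2]
        if energy(a1, b1) < 0 and energy(a2, b2) < 0:
            # Count favourable intermediates on the two two-step paths.
            n_fav = (energy(a2, b1) < 0) + (energy(a1, b2) < 0)
            label = ("complementary change", "partly similar", "highly similar")[n_fav]
        else:
            label = "not both favourable"
        totals[label] += weight
    norm = sum(totals.values())
    return {label: value / norm for label, value in totals.items()}


# Invented toy input: two equally abundant residues, where unlike residues
# attract and like residues repel.
toy_abundance = {"K": 0.5, "E": 0.5}


def toy_charge_energy(res_x, res_y):
    return -1.0 if res_x != res_y else 1.0


fractions = expected_fractions(toy_abundance, toy_charge_energy)
```

In this toy case a quarter of all combinations are favourable in both proteins, split evenly between highly similar and complementary changes, with no partly similar cases.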