It is instructive to consider the fraction of maps removed by each of
the filters described above. For example, for a pattern derived from a DSSP
assignment of secondary structure for thioredoxin that allows for 2
secondary structure element deletions from the query and 5 from the
database, the initial alignment of secondary structure elements
reduces the number of folds from .
folds
have no match of secondary structures with the predicted
thioredoxin pattern. Table 1 illustrates the fractions of the initial
maps within
folds that are removed by each filter when
applied independently. Table 2 shows, for the same example, how the
number of maps drops as the filters are applied in succession. The
filters are independent of one another apart from consistency filtering,
which must be applied after loop and distance restraint filtering,
and redundancy filtering, which must be applied last. The order of
filters shown in Table 1 was chosen to optimise speed.
The gradual elimination of maps and folds shows how the simple
principles of protein structure are sufficient to reduce the
number of possible alignments by two orders of magnitude. Interestingly,
the number of folds drops very little after the generation of
maps, suggesting that the filters mostly remove
nonsensical maps associated with each identified fold rather
than ruling out folds. Note that
consistency filtering tends to remove maps only when tight loop
lengths or distance restraints are included in the pattern.
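
The two ways of counting described above (each filter applied independently to the initial maps, and the filters applied in succession) can be sketched as follows. This is a minimal illustration only, not the implementation used here: the map representation, the toy filters, the toy data and all function names are hypothetical. The only parts taken from the text are the ordering constraints (consistency filtering after loop and distance restraint filtering, redundancy filtering last). No consistency filter is implemented in the toy example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy map: (fold_id, loop_ok, distance_ok, alignment_key).
Map = Tuple[str, bool, bool, str]
Filter = Callable[[List[Map]], List[Map]]


def loop_filter(maps: List[Map]) -> List[Map]:
    """Keep maps whose (toy) loop-length flag is satisfied."""
    return [m for m in maps if m[1]]


def distance_filter(maps: List[Map]) -> List[Map]:
    """Keep maps whose (toy) distance-restraint flag is satisfied."""
    return [m for m in maps if m[2]]


def redundancy_filter(maps: List[Map]) -> List[Map]:
    """Keep the first map for each (fold, alignment) pair."""
    seen = set()
    kept = []
    for m in maps:
        key = (m[0], m[3])
        if key not in seen:
            seen.add(key)
            kept.append(m)
    return kept


def check_order(order: Sequence[str]) -> None:
    """Enforce the ordering constraints stated in the text."""
    pos = {name: i for i, name in enumerate(order)}
    if "consistency" in pos:
        for prereq in ("loop", "distance"):
            if prereq in pos and pos[prereq] > pos["consistency"]:
                raise ValueError(f"consistency must follow {prereq}")
    if "redundancy" in pos and pos["redundancy"] != len(order) - 1:
        raise ValueError("redundancy filtering must be applied last")


def removed_independently(maps: List[Map],
                          filters: Dict[str, Filter]) -> Dict[str, float]:
    """Fraction of the initial maps removed by each filter on its own."""
    total = len(maps)
    return {name: (total - len(f(list(maps)))) / total
            for name, f in filters.items()}


def applied_in_succession(maps: List[Map], filters: Dict[str, Filter],
                          order: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Rows of (stage, maps remaining, folds remaining) after each filter."""
    check_order(order)
    remaining = list(maps)
    rows = [("initial", len(remaining), len({m[0] for m in remaining}))]
    for name in order:
        remaining = filters[name](remaining)
        rows.append((name, len(remaining), len({m[0] for m in remaining})))
    return rows


# Invented toy data, unrelated to the thioredoxin example.
TOY_MAPS: List[Map] = [
    ("foldA", True, True, "a1"),
    ("foldA", True, True, "a1"),   # duplicate of the previous map
    ("foldA", False, True, "a2"),  # fails the loop filter
    ("foldB", True, False, "b1"),  # fails the distance filter
    ("foldB", True, True, "b2"),
]
FILTERS: Dict[str, Filter] = {
    "loop": loop_filter,
    "distance": distance_filter,
    "redundancy": redundancy_filter,
}

if __name__ == "__main__":
    print(removed_independently(TOY_MAPS, FILTERS))
    for row in applied_in_succession(TOY_MAPS, FILTERS,
                                     ["loop", "distance", "redundancy"]):
        print(row)
```

In this toy data the number of maps falls at every stage while both folds survive, which mirrors the qualitative behaviour noted above: the filters remove maps within a fold far more often than they eliminate the fold itself.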