Representatives (queries) from each of eleven structural families whose
members share structural similarity despite a lack of sequence similarity
[Russell \& Barton, 1994] were chosen to assess the
method. The
queries are shown in Table 4 and represent a
diversity of folds from all four protein folding classes. For all
queries, there is at least one clear example of a similar fold in the
database that does not show any detectable sequence similarity to the
query. For reference, similar folds in the database were
identified with the STAMP (structural alignment of multiple proteins)
structure comparison program [Russell \& Barton, 1992] and by consulting the
structural classification of proteins (SCOP) database
[Murzin et al., 1995].
Two patterns were defined for each of the eleven structures: (a) one
taken directly from the DSSP secondary structure assignment and
accessibility (i.e. perfect prediction) and (b) one from
cross-validated secondary structure and accessibility prediction by
the methods of Rost \& Sander [Rost \& Sander, 1994][Rost \& Sander, 1993]. The PHD program
and jack-knifed neural network architectures were kindly provided by
Dr Burkhard Rost (EMBL).
Experimental secondary structure summaries and accessibilities (a)
were taken from DSSP [Kabsch \& Sander, 1983].
Predicted secondary structure summaries (b) were taken from the `PHD sec'
entries and accessibilities from the `SUB acc' entries, since these
most closely resembled the assignments from the calculation
of accessibility. PHD assignments of buried and exposed states
were classified as buried and exposed, respectively, and all other
positions (`i' or no assignment) were classified as `u'. Strands shorter than
two residues and helices shorter than four residues were ignored. The
length of each secondary structure element was set to the number of residues
it contains (maximum = minimum), and the number of
residues between consecutive secondary structure elements was taken as the
minimum loop length.
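
The pattern construction rules above can be illustrated with a minimal Python sketch. It is not the original implementation: the names `Element`, `map_accessibility` and `build_elements` are hypothetical, the input is assumed to be a three-state string (`H' helix, `E' strand, anything else coil), and residues belonging to ignored short elements are assumed to count towards the surrounding loop.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional

# Minimum run length kept in a pattern: strands of at least two residues,
# helices of at least four residues (shorter runs are ignored).
MIN_LENGTH = {"E": 2, "H": 4}


@dataclass
class Element:
    """One secondary structure element of a pattern (hypothetical layout)."""
    kind: str                 # 'E' (strand) or 'H' (helix)
    min_len: int              # minimum length; equal to max_len here
    max_len: int              # maximum length; equal to min_len here
    min_loop: Optional[int]   # residues since the previous kept element,
                              # None for the first element


def map_accessibility(acc: str) -> str:
    """Keep buried ('b') and exposed ('e') states; map everything else,
    including 'i' and unassigned positions, to 'u'."""
    return "".join(c if c in ("b", "e") else "u" for c in acc)


def build_elements(ss: str) -> List[Element]:
    """Turn a secondary structure string into a list of pattern elements."""
    elements: List[Element] = []
    pos = 0                            # index of the first residue of the run
    prev_end: Optional[int] = None     # index just past the last kept element
    for state, run in groupby(ss):
        length = sum(1 for _ in run)
        start, pos = pos, pos + length
        minimum = MIN_LENGTH.get(state)
        if minimum is None or length < minimum:
            continue                   # coil, or an element too short to keep
        # Assumption: skipped short elements are absorbed into the loop.
        loop = None if prev_end is None else start - prev_end
        elements.append(Element(state, length, length, loop))
        prev_end = pos
    return elements
```

For the string `--EEEE---H--HHHHHH-EE`, this sketch keeps a four-residue strand, a six-residue helix preceded by a minimum loop of six residues (the single-residue helix is absorbed into the loop), and a two-residue strand preceded by a minimum loop of one residue.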
Patterns may also contain distance restraints, such as those available from
NMR experiments, disulphide linkages, or site-directed mutagenesis (SDM)
studies. Distance restraints were added only to the von Willebrand factor and
proteasome patterns (see Results).