A strategy is presented for protein fold recognition from secondary
structure assignments (α-helix and
β-strand). The method can detect
similarities between protein folds in the absence of sequence
similarity. Secondary structure mapping first
identifies all possible matches (maps) between a query string of secondary
structures and the secondary structures of protein domains of known
three-dimensional structure. The maps are then passed through a
series of structural filters to remove those that do not obey
simple rules of protein structure. The surviving maps are ranked
by scores from the alignment of predicted and experimental
accessibilities. Searches made with secondary structure assignments
for a test set of eleven fold-families put the correct
sequence-dissimilar fold in the first rank 8/11 times. With
cross-validated predictions of secondary structure this drops to 4/11,
which compares favourably with the widely used THREADER program
(1/11). The structural class is correctly predicted 10/11 times by
the method in contrast to 5/11 for THREADER. The new technique
obtains comparable accuracy in the alignment of amino acid residues
and secondary structure elements. Searches are also performed with
published secondary structure predictions for the von-Willebrand
factor type A domain, the proteasome 20S subunit and the
phosphotyrosine interaction domain. These searches demonstrate how
the method can find the correct fold for a protein from a carefully
constructed secondary structure prediction, multiple sequence
alignment and distance restraints. Scans with experimentally
determined secondary structures and accessibility recognise
the correct fold with high alignment
accuracy (86% on secondary structures). This suggests that the
accuracy of mapping will improve alongside any improvements in the
prediction of secondary structure or accessibility. Application to
NMR structure determination is also discussed.
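
The mapping and filtering steps summarised above can be illustrated with a minimal sketch. It is not the published implementation: it assumes that secondary structures are encoded as strings of H (helix) and E (strand), that a map assigns every query element to a target element of the same type while preserving order, and that a single hypothetical filter (`passes_gap_filter`, with an invented `max_skipped` threshold) stands in for the structural filters. Ranking by accessibility is omitted.

```python
from itertools import combinations


def enumerate_maps(query, target):
    """Yield every order-preserving map of query elements onto target elements.

    Each map is a tuple of target indices, one per query element, chosen so
    that each mapped pair has the same type (H or E). Assumption: no query
    element may be left unmapped.
    """
    for idx in combinations(range(len(target)), len(query)):
        if all(target[j] == q for q, j in zip(query, idx)):
            yield idx


def passes_gap_filter(idx, max_skipped=1):
    """Hypothetical filter: reject maps that skip more than max_skipped
    target elements between two consecutive mapped elements."""
    return all(b - a - 1 <= max_skipped for a, b in zip(idx, idx[1:]))


query = "HEE"      # illustrative query: helix, strand, strand
target = "EHEEHE"  # illustrative domain of known structure

maps = list(enumerate_maps(query, target))
kept = [m for m in maps if passes_gap_filter(m)]

print(maps)  # all candidate maps
print(kept)  # maps surviving the filter
```

Exhaustive enumeration grows combinatorially with the number of elements, so a practical search would presumably prune candidates while they are being built rather than afterwards.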