Bork & Margolis (1995) recently identified a new phosphotyrosine
interaction domain (PID) involved in the cytoplasmic signalling
cascade. They constructed an alignment of several diverse members of
this sequence family, and performed a prediction of secondary
structure. We ran the PHD program on a slightly more up-to-date
alignment of PID proteins (P. Bork, personal communication), to
predict the secondary structure and accessibility. A search pattern
was built from this prediction, with loop length ranges taken from
the multiple alignment. The pattern of nine secondary structure
elements was BBHBBBBBH (B, strand; H, helix), and these elements are
numbered sequentially from 1 to 9 below. Since there were two long
loops connecting the
predicted secondary structures, the adjacent parallel filter was not
used during the search. Structures corresponding to the best
alignment with each of the top six scoring folds are shown in Figure 3.
Recent structure determination has shown the PID (PTB domain) to
resemble the pleckstrin homology (PH) domain in structure and function
[Zhou et al., 1995]. By the accessibility scoring scheme, the top
ranked fold is not a PH domain, although a PH domain (from dynamin) is
ranked at position 2. The top six folds are illustrative in that they
show how the method can suggest alternative plausible folds that
satisfy a pattern of predicted secondary structures and
accessibilities.
The best scoring fold (Figure 3a) is that of profilin (PDB code
2BFPP), and the best scoring map gives an anti-parallel sheet with the
strand order 218754 (predicted strand 6 is deleted) with one helix
packing against each face. The second best scoring fold is a correct
match with the PH domain from human dynamin (1DYNB), with the first
predicted helix deleted from the PID pattern. The third best scoring
fold (3c) comes from S. aureus lactamase (1BLH, domain 1), with an
anti-parallel sheet of order 54876 and both helices packing against
one face. The fourth and fifth best scoring folds come from members of
the Ig superfamily, and comprise alternative arrangements of strands
to form a Greek key sandwich. Both of the predicted helices from the
PID pattern have been deleted in these matches. Finally, the sixth
match (3e) comes from the tryptic core of E. coli lac repressor
(1TLFD, domain 4), and comprises a parallel sheet (42576) with both
helices packing against one face. This fold is perhaps the least
plausible, since it would require 3 crossover connections between
adjacent and parallel strands.
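
The strand orders quoted above can be related back to the numbered
pattern with a few lines of code. The following is only an
illustrative sketch, not part of the fold recognition method itself.
It assumes that B denotes a strand and H a helix, that each digit in a
strand order such as 218754 is the 1-based number of one element of
BBHBBBBBH, and that a strand missing from the order string is absent
from the matched sheet. The function name sheet_summary is
hypothetical.

```python
PATTERN = "BBHBBBBBH"  # predicted elements, numbered 1-9 in the text


def sheet_summary(pattern, strand_order):
    """Relate a strand-order string such as "218754" to the numbered pattern.

    Assumption: B = strand, H = helix, and every digit of strand_order is
    the 1-based number of an element in the pattern (at most 9 elements).
    """
    strands = [i for i, kind in enumerate(pattern, start=1) if kind == "B"]
    helices = [i for i, kind in enumerate(pattern, start=1) if kind == "H"]

    # One digit per strand, read in spatial order across the sheet.
    in_sheet = [int(ch) for ch in strand_order]

    # Every element placed in the sheet must be a predicted strand.
    not_strands = [n for n in in_sheet if n not in strands]
    if not_strands:
        raise ValueError(f"elements {not_strands} are not strands in {pattern}")

    # Predicted strands that do not appear in the sheet order.
    absent = [n for n in strands if n not in in_sheet]
    return {"sheet_order": in_sheet, "absent_strands": absent, "helices": helices}


if __name__ == "__main__":
    # Strand orders quoted in the text for three of the matches.
    for name, order in [("profilin", "218754"),
                        ("lactamase", "54876"),
                        ("lac repressor", "42576")]:
        print(name, sheet_summary(PATTERN, order))
```

For the profilin order 218754 the only strand absent from the sheet is
number 6, which agrees with the deletion noted in the text. The sketch
says nothing about strand direction or helix packing, which would have
to be taken from the matched structures themselves.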
The method has suggested plausible alternative structures
that can be scrutinised, in the absence of 3D structural
data, by way of further experiments, secondary structure predictions,
or even other methods of fold recognition. The results show how the
predicted secondary structure elements can be accommodated into a
compact, plausible protein fold, and encouragingly, the method has
identified the correct fold high in the list of alternatives.