Next: Acknowledgements Up: No Title Previous: Homologous vs. analogous

Discussion

The results of this study suggest that distantly related protein structures have little in common. The proportion of distantly related protein structures that are actually similar to each other can be as little as of the maximum, which reinforces the observation that secondary structure lengths and loops vary substantially in distantly related structures. The degree to which accessibility and secondary structure are conserved on a residue by residue basis within structurally similar proteins can be as low as that for dissimilar proteins (i.e., that expected by chance). The fraction of shared interactions (pairs of residues in contact in both of two distantly related protein structures) can be as little as , even when a lenient definition of - distance is used. Structurally similar proteins can have almost no favourable interactions (those contributing a negative pseudo-energy term) in common. Finally, regardless of any functional similarity, similar protein 3D structures often have a proportion of complementary changes approaching that expected by chance. Taken together, the results suggest that proteins can adopt very similar folds by using almost completely different interactions, and that proteins having similar 3D structures can have little in common apart from a scaffold of common core secondary structures.
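The notion of a "fraction of shared interactions" can be illustrated with a small sketch. The Python below is not the procedure used in this study. It assumes one representative coordinate per residue, a hypothetical 8 Å cutoff, a minimum sequence separation of two residues, and normalisation by the smaller of the two contact counts (the largest number of contacts that could be shared). The names `contacts` and `shared_contact_fraction` are illustrative.

```python
from itertools import combinations
from math import dist


def contacts(coords, cutoff=8.0, min_separation=2):
    """Return the set of residue index pairs (i, j), i < j, in contact.

    coords: one (x, y, z) point per residue (an assumption of this sketch).
    cutoff and min_separation are illustrative defaults, not values from the study.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i >= min_separation and dist(coords[i], coords[j]) <= cutoff
    }


def shared_contact_fraction(coords_a, coords_b, alignment, cutoff=8.0):
    """Fraction of contacts found in both structures.

    alignment: list of (index_in_a, index_in_b) structurally equivalent residues.
    The denominator is the smaller contact count (an assumed normalisation).
    """
    a_to_b = dict(alignment)
    contacts_a = contacts(coords_a, cutoff)
    contacts_b = contacts(coords_b, cutoff)

    # Translate each contact in A into B's numbering; contacts involving
    # unaligned residues cannot be shared and are skipped.
    mapped = set()
    for i, j in contacts_a:
        if i in a_to_b and j in a_to_b:
            p, q = a_to_b[i], a_to_b[j]
            mapped.add((min(p, q), max(p, q)))

    shared = mapped & contacts_b
    denominator = min(len(contacts_a), len(contacts_b))
    return len(shared) / denominator if denominator else 0.0
```

With a real pair of structures, `alignment` would come from a structural superposition, and a low returned value would correspond to the situation described above, where two similar folds are held together by largely different residue pairs.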

The results presented here have many implications for methods of protein fold detection. The degree of conservation of secondary structure and accessibility, when considered on a residue by residue basis, is similar to that for structurally dissimilar proteins, and the proportion of residues in common cores is low. Together, these observations suggest why many methods of fold detection are often unable to detect genuine 3D structural similarities. In particular, methods that do not consider long-range interactions (i.e., side-chain to side-chain contacts) are unlikely to detect weak 3D structural similarities, since the other residue by residue (i.e., one-dimensional) measures of structural similarity are not well conserved for many genuinely similar proteins.
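To make the comparison between observed one-dimensional conservation and chance concrete, the following Python sketch computes both quantities for a pair of aligned state strings. It is an illustration only, not the measure used in this study. It assumes single-letter state labels (for example H, E and C for secondary structure), a `-` gap character, and a chance level estimated from the state frequencies of each sequence taken independently. The function names are hypothetical.

```python
from collections import Counter


def aligned_pairs(states_a, states_b, gap="-"):
    """Return the aligned state pairs in which neither position is a gap."""
    if len(states_a) != len(states_b):
        raise ValueError("aligned state strings must have equal length")
    return [(a, b) for a, b in zip(states_a, states_b) if a != gap and b != gap]


def state_conservation(states_a, states_b, gap="-"):
    """Observed fraction of aligned positions with identical state labels."""
    pairs = aligned_pairs(states_a, states_b, gap)
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def chance_agreement(states_a, states_b, gap="-"):
    """Agreement expected if states were paired at random.

    Computed as the sum over states of freq_a(state) * freq_b(state),
    using only the aligned, non-gap positions (an assumption of this sketch).
    """
    pairs = aligned_pairs(states_a, states_b, gap)
    if not pairs:
        return 0.0
    n = len(pairs)
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    return sum((freq_a[s] / n) * (freq_b[s] / n) for s in freq_a)
```

An observed value from `state_conservation` that is close to `chance_agreement` for the same pair would correspond to the case described above, in which one-dimensional measures carry little signal for a genuinely similar pair of structures.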

Methods which thread protein sequences onto 3D structural templates using pair potentials [Bryant & Lawrence, 1993][Godzik et al., 1993][Jones et al., 1992][Sippl et al., 1992] are likely to fare better, though all of these methods require that similar structures have a reasonable proportion of interacting residues in common. The small fraction of residues common to the core of distantly related proteins (as few as ), and the even smaller fraction of common interacting residues (as few as ), suggest that many protein 3D structural similarities will be undetectable even by threading methods, since key interactions are likely to be modelled incorrectly. Our findings suggest that more general features of protein structure, such as hydrophobic residues being buried in the core and polar residues lying on the surface, rather than particular residue-residue interactions, determine how well a particular sequence adopts a particular fold. If detection of similar folds having little in common outside of their core secondary structures is to become a reality, efforts should concentrate on such general principles, and on methods for modelling the large loop regions that are likely to differ between similar 3D structures.

The results provide little insight as to whether structurally similar proteins have evolved by divergence or convergence. However, the fact that there is no detectable difference between pairs of structures that are functionally similar and those that are not (at a similar ) suggests that it may be impossible to discern divergence from convergence. Those proteins which were defined as type similarities are often thought to have a common ancestor. For example, it seems very likely that the aspartic proteinase lobes (i.e., N- and C-terminal domains in the eukaryotic structures) are related both to each other (i.e., by gene duplication or exon shuffling; see Blundell et al., 1979) and to the single viral proteinase lobes which dimerise to form a similar structure (e.g., Lapatto et al., 1989). However, their degree of structural and sequence conservation is low. If one argues that the proteinase lobes are related by divergence, then, based on the degree of structural and sequence similarity, one could argue the same for the quite obviously functionally dissimilar plastocyanin and Ig light chain variable domain shown in Figure 1. It would seem that both the sequence and structure of similar proteins can evolve beyond recognition even when function is conserved.


gjb@
Thu Feb 9 18:06:48 GMT 1995