3Dee - Database of Protein Domain Definitions

Introduction

The number of protein three-dimensional structures determined to atomic resolution by experimental techniques continues to grow rapidly. However, the Protein Data Bank (PDB) contains a large amount of redundant information, since many of the proteins share sequence or structural similarity. The classification of proteins is thus an important step, both for producing representative sets of structures and for examining structural families.

Classification is most simply performed by taking the protein chain as the basic unit. However, it has long been recognised that classification should really be performed on structural domains rather than chains, since a protein may contain more than one domain. For example, two proteins, each with two domains, may have only one domain in common. In addition, a protein domain may be composed of more than one chain.
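The distinction between chains and domains can be made concrete with a small sketch. The Python below is purely illustrative and is not the 3Dee data model: the class names (Segment, Domain), the family labels and the residue ranges are all hypothetical. It represents a domain as one or more residue segments, each tied to a chain, so that a single domain can span several chains and two proteins can be compared domain by domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A contiguous residue range on one chain (hypothetical representation)."""
    chain: str
    start: int
    end: int


@dataclass(frozen=True)
class Domain:
    """A domain built from one or more segments, possibly on different chains."""
    name: str
    family: str      # hypothetical family label, for illustration only
    segments: tuple  # tuple of Segment objects

    def chains(self):
        # Set of chain identifiers that contribute residues to this domain.
        return {seg.chain for seg in self.segments}

    def is_multichain(self):
        # True when the domain draws residues from more than one chain.
        return len(self.chains()) > 1


def shared_families(domains_a, domains_b):
    """Return the family labels that occur in both proteins."""
    return {d.family for d in domains_a} & {d.family for d in domains_b}


# Two made-up proteins, each with two domains, that share only one family.
protein_x = [
    Domain("x1", "fam1", (Segment("A", 1, 120),)),
    Domain("x2", "fam2", (Segment("A", 121, 250),)),
]
protein_y = [
    Domain("y1", "fam1", (Segment("A", 1, 110),)),
    # This domain is assembled from residues on chains A and B.
    Domain("y2", "fam3", (Segment("A", 111, 180), Segment("B", 1, 60))),
]

common = shared_families(protein_x, protein_y)
```

Comparing these two proteins at the chain level would treat each chain as a single unit, whereas the domain-level comparison shows that they have exactly one family in common and that one of the domains cannot be assigned to a single chain.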

Classification of domains is complex. Although structural domains are often thought of as compact, semi-independent units (Richardson, 1981), there is no universally accepted definition of a domain. For many proteins, domains are difficult to define or the domain definitions are ambiguous. In 3Dee, the program DOMAK (Siddiqui & Barton, 1995) was used initially to create the domain definitions, which were then inspected and corrected where necessary. Care has been taken in the 3Dee database to ensure that domain definitions are consistent within sequence families.

References

Richardson, J. S. (1981). The anatomy and taxonomy of protein structure. Advances in Protein Chemistry 34, 246.

Siddiqui, A. S. & Barton, G. J. (1995). Continuous and discontinuous domains: an algorithm for the automatic generation of reliable protein domain definitions. Protein Science 4, 872-884.


3dee@compbio.dundee.ac.uk