A database of protein 3D structural domains was derived
from the Brookhaven Protein Data Bank (PDB) [Bernstein et al., 1977].
Non-identical chains were clustered by sequence comparison
[Barton, 1993][Smith \& Waterman, 1981] to leave
sequence families. One
representative of each family was chosen, namely the structure with the highest resolution and lowest R-factor.
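
The selection rule can be sketched in a few lines. This is only an illustration: the `Chain` record and its field names are hypothetical, and the sketch assumes that resolution is the primary criterion with R-factor as the tie-breaker, which the text does not state explicitly. Note that "highest resolution" corresponds to the numerically smallest value in Å.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Chain:
    """Hypothetical record for one protein chain (field names are assumptions)."""
    chain_id: str
    resolution: float  # in Angstrom; a smaller value means higher resolution
    r_factor: float    # a smaller value means better agreement with the data


def pick_representative(family: Iterable[Chain]) -> Chain:
    """Return the chain with the smallest resolution value, then smallest R-factor.

    Assumption: resolution is compared first and R-factor only breaks ties.
    """
    return min(family, key=lambda c: (c.resolution, c.r_factor))


def pick_representatives(families: Dict[str, List[Chain]]) -> Dict[str, Chain]:
    """Choose one representative for every sequence family."""
    return {name: pick_representative(members) for name, members in families.items()}
```

Sorting on a `(resolution, r_factor)` tuple keeps the rule explicit and easy to change if a different weighting of the two criteria is preferred.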
The representative structures were then split into
domains by eye.
A sub-database of higher quality domains was created for
analysis. This contained only
those structures determined by X-ray crystallography, refined and of
a resolution of
Å or better. Secondary structures for all
domains were defined by the program DSSP (definition of secondary structure in proteins) [Kabsch \& Sander, 1983]
or by DEFINE [Richards \& Kundrot, 1988] when only
C$\alpha$ atoms were available. Axial coordinates were
calculated for all secondary structures as described in
[Richards \& Kundrot, 1988]. Extra axial coordinates were calculated at the N- and
C-terminal ends to allow for possible differences in secondary
structure length. The domain database is available via the WWW
(http://barton.ebi.ac.uk/).
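
The idea of axial coordinates with extra points at each terminus can be illustrated as follows. This sketch does not reproduce the procedure of [Richards \& Kundrot, 1988]; it substitutes a plain least-squares line fit (via SVD) through the atom positions of one secondary structure element, assumed here to be one atom per residue such as C$\alpha$. The function name and the `n_extra` parameter are hypothetical.

```python
import numpy as np


def axial_coordinates(coords, n_extra=1):
    """Project the atoms of one element onto a fitted axis and extend both ends.

    coords  : (N, 3) array of atom positions ordered from N- to C-terminus.
    n_extra : hypothetical parameter, number of extra axial points per end.
    Returns an (N + 2 * n_extra, 3) array of points lying on the axis.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 2 or coords.shape[1] != 3 or len(coords) < 2:
        raise ValueError("need at least two 3D points")

    centre = coords.mean(axis=0)
    # The first right-singular vector is the direction of largest variance,
    # i.e. the least-squares line through the points.
    _, _, vt = np.linalg.svd(coords - centre)
    direction = vt[0]
    # Orient the axis so that it runs from the N- to the C-terminus.
    if np.dot(coords[-1] - coords[0], direction) < 0:
        direction = -direction

    # Scalar position of every atom along the axis.
    t = (coords - centre) @ direction
    # Mean rise per residue, used to space the extra terminal points.
    step = (t[-1] - t[0]) / (len(t) - 1)
    before = t[0] - step * np.arange(n_extra, 0, -1)
    after = t[-1] + step * np.arange(1, n_extra + 1)
    t_all = np.concatenate([before, t, after])
    return centre + np.outer(t_all, direction)
```

For example, four points spaced 1.5 Å apart along the z-axis yield six axial points with `n_extra=1`, one beyond each terminus at the same spacing. The extra points let elements of slightly different length be compared without truncating either one.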