3Dee - Database of Protein Domain Definitions


Help on Using 3Dee


Overall Structure of the Database

For a complete description of the 3Dee database and how it was derived and is maintained, please see (manuscript in preparation). These notes are intended to guide you in the use of the on-line database to access information on protein structural domains.

Different Types of Pages

There are four different types of page in the database. The chain pages give a chain by chain description of the domains in the database. The domain sequence family pages contain lists of all domains that share sequence similarity. A single representative member from each domain sequence family is used to define two structure classification hierarchies, the 3Dee Hierarchy and the SCOP-like Hierarchy, whose pages make up the remaining two types.

The Chain Page

A unique chain page exists for each protein and each chain of each protein in the database. For example, the Immunoglobulin structure 2fb4 has a chain page 2fb4 that contains information on both chains in the structure. In addition, there are separate chain pages for 2fb4l and 2fb4h which only contain information on the respective chains.
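
The naming convention in this example can be sketched in Python. The function name `split_chain_page_id` is hypothetical, and the sketch assumes that a page name is a four-character PDB code optionally followed by a single chain identifier, as in the 2fb4 example. Other naming cases in the database may not be covered.

```python
def split_chain_page_id(page_id):
    """Split a chain page name into (pdb_code, chain).

    Assumes a four-character PDB code, optionally followed by one
    chain character. chain is None for a whole-protein page.
    """
    if len(page_id) == 4:
        return page_id, None
    if len(page_id) == 5:
        return page_id[:4], page_id[4]
    raise ValueError(f"unexpected chain page name: {page_id!r}")


for name in ("2fb4", "2fb4l", "2fb4h"):
    print(name, split_chain_page_id(name))
```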

A chain page consists of a header, followed by the domain definitions. In common with the other pages there is a search field at the bottom of the page.

Header

The chain page header is listed after the banner. The information in it is extracted from the PDB file header. There is also a link to view the header of the PDB file and a link to SCOP, denoted by the SCOP symbol. The SCOP link connects to the SCOP search engine for the PDB code of the protein being viewed.

Domain Definitions

After the header the domain definitions are listed. If the chain page contains the domain definitions of more than one chain, the domain definitions are listed for each chain in turn. The chains are presented in the order that they appear in the PDB file.

The default domain definition comes first, followed by any "equally valid" domain definitions (if present).

The domains are listed in the order they are encountered as one follows the chain from the N to C terminus. If a domain comprises more than one segment, its position in the list is determined by its first-occurring segment. The segments of a multi-segment domain are likewise listed in the order they occur from N to C terminus.
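
The ordering rule can be sketched as a two-step sort. This is only an illustration: the function name `order_domains` and the example data are hypothetical, and segments are simplified to pairs of integer residue numbers, whereas real residue labels also carry a chain and an insertion code.

```python
def order_domains(domains):
    """Order segments within each domain, then domains by first segment.

    domains: dict mapping a domain name to a list of (start, end)
    residue-number pairs (plain integers are an assumption).
    """
    ordered = {name: sorted(segs) for name, segs in domains.items()}
    return sorted(ordered.items(), key=lambda item: item[1][0])


example = {
    "B": [(60, 120)],
    "A": [(121, 150), (1, 59)],  # a two-segment domain
}
for name, segments in order_domains(example):
    print(name, segments)
```

Here domain A is listed first because its earliest segment starts at residue 1, even though its second segment follows domain B.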

Note that for multi-chain domains the definitions are duplicated for each chain. For example, if the multi-chain domain contains chains A and C, the domain definitions of chain A and chain C will be identical.

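Each segment is labelled with a start residue "FROM" and an end residue "TO". The residues are labelled in the form a:b:c, where a is the chain identifier, b is the residue number and c is the insertion code. A "-" indicates that the chain is unlabelled or that there is no insertion code.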

For example, C:34:- refers to residue 34 of chain C, whilst -:34:A refers to residue 34 of the unlabelled chain with insertion code "A".
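
A minimal parser for these labels might look as follows. The names `ResidueLabel` and `parse_residue_label` are hypothetical, and the field meanings (chain, residue number, insertion code, with "-" for an empty field) are inferred from the two examples above rather than from a formal specification.

```python
from typing import NamedTuple, Optional


class ResidueLabel(NamedTuple):
    chain: Optional[str]           # None for an unlabelled chain
    number: int                    # residue number
    insertion_code: Optional[str]  # None when there is no insertion code


def parse_residue_label(label):
    """Parse a label of the form chain:number:insertion_code."""
    chain, number, icode = label.split(":")
    return ResidueLabel(
        chain=None if chain == "-" else chain,
        number=int(number),
        insertion_code=None if icode == "-" else icode,
    )


print(parse_residue_label("C:34:-"))
print(parse_residue_label("-:34:A"))
```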

Each domain is also tagged with a Class and Fold. The source of this information is given as either 3Dee or SCOP. In situations where both occur, the 3Dee definition takes precedence over the SCOP definition.

Clicking on "Enter Classification" will display the Domain Sequence Family page from which the structural hierarchies can be accessed.

Rasmol Interface to 3Dee

Clicking on "View Domains" will load the protein into RASMOL (provided you have installed the interface). The domains are coloured in the following order, corresponding to the order in which they are listed in the chain page:
  1. red
  2. green
  3. blue
  4. yellow
  5. purple
  6. cyan
  7. magenta
  8. greenblue
  9. violet
  10. redorange
Clicking on "rasmol" under "See Domain" will highlight only that domain. The domain is coloured by the "group" option in RASMOL. This shows the protein coloured blue to red from N to C terminus, making it easy to see how the chain runs through the domain. NB: RASMOL cannot always display this colouring method (usually when the protein contains many chains), in which case the whole domain appears in a single colour.
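
The colour order used by "View Domains" can be written as a simple lookup. The names `DOMAIN_COLOURS` and `domain_colour` are hypothetical. These notes do not say what happens when a chain page lists more than ten domains, so this sketch returns None in that case as a placeholder assumption.

```python
# Colours in the order given for "View Domains".
DOMAIN_COLOURS = (
    "red", "green", "blue", "yellow", "purple",
    "cyan", "magenta", "greenblue", "violet", "redorange",
)


def domain_colour(position):
    """Colour for the domain at a 1-based position in the chain page list.

    Returns None beyond the tenth domain (behaviour there is an assumption).
    """
    if 1 <= position <= len(DOMAIN_COLOURS):
        return DOMAIN_COLOURS[position - 1]
    return None


for position in (1, 2, 10, 11):
    print(position, domain_colour(position))
```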

Alternative Domain Definitions

The link labelled "Show alternative domain definitions for this chain" takes you to a page of alternative domain definitions for the chain. These domain definitions, if present, are earlier domain definitions for the chain which have subsequently been corrected or modified in the current database. They are stored to maintain a history of the database and are listed in a similar way to the domain definitions on the chain pages, but without the Class and Fold names and without the link to the Domain Sequence Family.

The Domain Sequence Family

These pages contain lists of all domains that have clear sequence similarity to each other (and hence structural similarity). One domain is chosen to be the representative of the set and it is defined at the top of the page.

There are links to the bottom of the 3Dee and SCOP-like structural hierarchies for that domain. See the description of these hierarchies below for more information.

Next is the actual list of the domains with sequence similarity. Clicking on a domain name connects to the chain page where that domain is defined.

The 3Dee Hierarchy

The 3Dee hierarchy is constructed using the scores from a structural scoring scheme. Each page has the same format (although superpositions may only be viewed for clusters which score >= 3.0).

At the top of the page is the score at which the cluster was created. The score is derived from the structural scoring scheme. Means linkage clustering was used to produce the cluster. Unless the cluster is the top of the hierarchy, there is a link to the next level up in the hierarchy. The higher level will have a lower score and hence contain more domains.
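
The relationship between score and cluster size can be illustrated with a small sketch. It assumes that "means linkage" refers to average-linkage clustering, in which the score between two clusters is the mean of the pairwise scores between their members, and that a higher score means greater structural similarity. The function name `mean_linkage`, the domain names and the scores are all hypothetical. This is not the 3Dee implementation.

```python
from itertools import combinations


def mean_linkage(names, scores):
    """Cluster by average pairwise score (higher score = more similar).

    scores maps frozenset({a, b}) to the score between domains a and b.
    Returns the merges as (score, sorted members), in the order made.
    """
    def mean_score(pair):
        x, y = pair
        total = sum(scores[frozenset((a, b))] for a in x for b in y)
        return total / (len(x) * len(y))

    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the highest mean pairwise score.
        best = max(combinations(clusters, 2), key=mean_score)
        merged = best[0] | best[1]
        merges.append((mean_score(best), sorted(merged)))
        clusters = [c for c in clusters if c not in best] + [merged]
    return merges


# Hypothetical domain names and scores, for illustration only.
scores = {
    frozenset(("d1", "d2")): 6.0,
    frozenset(("d1", "d3")): 2.0,
    frozenset(("d2", "d3")): 3.0,
}
for score, members in mean_linkage(["d1", "d2", "d3"], scores):
    print(score, members)
```

In this toy example the first cluster is created at score 6.0 with two domains, and the next level up is created at the lower score 2.5 and contains all three.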

The Class and Fold Tables

The first table contains a list of the classes of domains in the current cluster. The percentage breakdown by class of the domains in the cluster is given, together with the number of domains of each class as a percentage of the number in the representative set.

A separate table for each class present in the cluster comes next, listing, in alphabetical order, the fold names of the domains in the cluster. Again, each fold name is followed by a percentage breakdown by fold of the domains in the cluster, together with the number of domains of each fold as a percentage of the number in the representative set.
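
The two percentages in these tables can be sketched as follows. The function name `breakdown` and the example data are hypothetical. The sketch assumes that the second percentage is taken relative to the number of domains with the same class (or fold) name in the representative set. The text above could also be read as meaning the size of the whole representative set, so treat that denominator as an assumption.

```python
from collections import Counter


def breakdown(cluster_labels, representative_labels):
    """Return {label: (percent of cluster, percent of representative set)}.

    cluster_labels: class (or fold) name of each domain in the cluster.
    representative_labels: the same for every domain in the representative set.
    """
    in_cluster = Counter(cluster_labels)
    in_rep_set = Counter(representative_labels)
    total = len(cluster_labels)
    table = {}
    for label in sorted(in_cluster):  # alphabetical order
        n = in_cluster[label]
        table[label] = (100.0 * n / total, 100.0 * n / in_rep_set[label])
    return table


cluster = ["alpha", "alpha", "beta", "alpha"]
representative_set = ["alpha"] * 8 + ["beta"] * 4
for label, (of_cluster, of_rep_set) in breakdown(cluster, representative_set).items():
    print(label, of_cluster, of_rep_set)
```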

The fold name is tagged by its origin. Folds that are identified as a SCOP fold are labelled as such. The remainder originate from 3Dee. If the SCOP symbol is present, it connects directly to the appropriate SCOP fold page.

The Organisation of Sub-Clusters (same in both the 3Dee and SCOP-like hierarchies)

This section starts with the line "This cluster is formed by the merging of the following clusters", followed by the set of sub-clusters that make up this cluster. Cluster 0, if it is present, always contains domains that sub-divide into single-membered clusters. All other clusters sub-divide further and can be "exploded" by clicking on the appropriate link. Clicking on a domain name connects to the chain page. There is also a link to the Domain Sequence Family page of the domain. The class and fold name of the domain are listed if available.

Viewing Structural Superpositions (same in both the 3Dee and SCOP-like hierarchies)

Structural superpositions are only available from pages on which the cluster score is >= 3.0. In this case, each domain name on the page is preceded by a checkbox and at the bottom of the page is the heading "View Superpositions of `checked' domains", followed by some buttons.

To enable this feature you must first install the superposition software. Select the domains you wish to view by clicking on the checkboxes, then click on the "View Superpositions" button to bring up the superpositions of the selected domains. Clicking on the "Reset Form" button deselects all "checked" domains.

On clicking "View Superpositions" the transformation matrix relating the structures is downloaded. Matrix operations are carried out on the local PDB files to produce a combined PDB file. This is sent to the RASMOL program (if a RASMOL window does not exist, one is created). By default, a window giving information on the program status is displayed. Another window listing each of the domains, together with its colour as displayed by RASMOL, is also produced. Selecting and deselecting these domains turns them on and off in RASMOL.
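
The kind of matrix operation involved can be sketched as a rigid-body transformation of atomic coordinates. The format of the downloaded transformation is not described in these notes, so this sketch assumes a 3x3 rotation matrix plus a translation vector. The function name `transform` and the example values are hypothetical.

```python
def transform(coords, rotation, translation):
    """Apply x' = R x + t to each (x, y, z) coordinate.

    rotation: 3x3 matrix as a list of rows (assumed format).
    translation: sequence of three numbers (assumed format).
    """
    result = []
    for x, y, z in coords:
        result.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + t
            for row, t in zip(rotation, translation)
        ))
    return result


# A 90 degree rotation about the z axis followed by a shift along x.
rotation = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]
translation = (1, 0, 0)
print(transform([(1, 0, 0), (0, 2, 5)], rotation, translation))
```

Applying such a transformation to every atom of one domain places it in the coordinate frame of another, after which the structures can be written out together.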

The SCOP-like Hierarchy

This hierarchy derives its name from the SCOP database, as the two highest levels of classification are class and fold, the same as in SCOP. In the SCOP-like hierarchy, all domains that have been given the same fold name are grouped together. This does not always happen in the 3Dee hierarchy, where it is common for similar folds with different names to be grouped together before all folds with the same name are grouped. In the SCOP-like hierarchy, all the domains with the same fold name are further classified according to their scores in the structural scoring scheme. In other words, the scores used to create the 3Dee hierarchy are also used here, but clustering is applied only among domains with the same fold name. There are three types of pages associated with the SCOP-like hierarchy.
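
The top two levels of this hierarchy amount to grouping domains by class and then by fold name before any score-based clustering takes place. A minimal sketch, with a hypothetical function name `scop_like_groups` and made-up domain, class and fold names:

```python
def scop_like_groups(domains):
    """Group domains by class, then by fold name.

    domains: iterable of (domain_name, class_name, fold_name) tuples.
    Returns {class_name: {fold_name: [domain_name, ...]}}.
    """
    tree = {}
    for name, class_name, fold_name in domains:
        tree.setdefault(class_name, {}).setdefault(fold_name, []).append(name)
    return tree


# Made-up entries, for illustration only.
example = [
    ("dom1", "class X", "fold P"),
    ("dom2", "class X", "fold Q"),
    ("dom3", "class X", "fold P"),
    ("dom4", "class Y", "fold R"),
]
print(scop_like_groups(example))
```

Each innermost list would then be clustered on its own using the structural scores.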

The SCOP-like Top of Hierarchy Page

This page contains a list of the different classes in the 3Dee database. Also present are the number of domains which fall into each class in both the representative set and entire database. These numbers are also expressed as percentages. Clicking on a class connects to the corresponding class page.

The Class Page

There is one class page for each class. At the top of the page is a link to the SCOP-like "top of hierarchy" page followed by the class name itself and a table of folds. The folds are all of the class described and are presented in alphabetical order. Also present are the number of domains which are given each fold name, in both the representative set and entire database. These numbers are also expressed as percentages.

As with the 3Dee hierarchy, the fold names are tagged by their origin. Folds that are identified as a SCOP fold are labelled as such. The remainder originate from 3Dee. Clicking on a fold name connects to the corresponding fold page.

The Fold Page

At the top of the page is the score at which the cluster was created. The score is derived from the structural scoring scheme. Means linkage clustering was used to produce the cluster. If the cluster is the top level cluster, there is a link to the corresponding class page. Otherwise, there is a link to the next level up in the hierarchy. The higher level will have a lower score and hence contain more domains of the same fold.

The class and fold name are given next. As for the class pages, the fold name is tagged with its origin, though this time any link attached to the fold name connects to the appropriate page in SCOP.

Sub-Cluster Organisation - see corresponding heading in 3Dee Hierarchy

Viewing Structural Superpositions - see corresponding heading in 3Dee Hierarchy

3dee@compbio.dundee.ac.uk