
Describing domain structures

Every entry in a STAMP input file is called a `domain'. This word is a bit of a misnomer, since these things needn't be single domains (though it is usually best to do structure comparisons at the domain level).

Domains must cover a wide variety of possibilities (e.g. all the coordinates in a PDB file, one chain, bits of one chain, two chains, one chain and bits of another, etc). This is achieved by defining a domain by: 1) a file, 2) an identifier, and 3) a list of `objects' from the file to be included in the domain. An object is defined as a run of ${\rm C}_{\alpha}$ coordinates, and a domain may contain more than one object.

STAMP stores domains in files, each of which may contain one or more such domain definitions.

The format of these files must be as follows:

<file name> <identifier label> { <objects> }

or, if a transformation is supplied:


<file name> <identifier label> { <objects> [RETURN]
   R11 R12 R13   V1
   R21 R22 R23   V2
   R31 R32 R33   V3 }

<file name> is the full name (including path) of the PDB file in which the coordinate information is to be found, e.g. /usr/people/jack/pdb4mbn.ent. If you don't know the precise location of the file, put a placeholder such as UNK (i.e. not a blank); the programs should then be able to find the appropriate PDB file using the identifier, if one can be found on your system.

<identifier label> is a short name to be used by the program, e.g. 4mbn1

If secondary structures are to be found by the program, then the first four letters of the identifier label should be the PDB code, which should correspond to the prefix used in your PDB/DSSP naming system. Identifier labels should not be duplicated, so that self comparison is possible. Each label should contain the Brookhaven four-letter code first and anything else afterwards.

<objects> are coordinate descriptions, and may be one of three types:

1.  ALL
all ${\rm C}_{\alpha}$'s from the file.

2.  CHAIN X
only ${\rm C}_{\alpha}$'s labeled as chain X.

3.  <chain1> <number1> <insert1> to <chain2> <number2> <insert2>
e.g.  B 20 _ to B 67 P
only ${\rm C}_{\alpha}$'s between (and including) the two full Brookhaven
residue names (chain, number, insertion code; the `_` character denotes a space).

N.B. THERE MUST BE AT LEAST ONE SPACE BETWEEN THE VARIOUS FIELDS. Combinations of these are allowed within one domain, e.g. ` CHAIN A B 1 _ to B 65 _ `
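
The three object types above can be recognised by a simple whitespace tokenizer. The following is a minimal sketch in Python, not STAMP's own parser; the function name `parse_objects` and the tuple layout it returns are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def parse_objects(spec: str) -> List[Tuple]:
    """Split an <objects> string into ALL, CHAIN and residue-range objects.

    Hypothetical helper for illustration; not part of STAMP.
    """
    tokens = spec.split()  # fields must be separated by at least one space
    objects = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "ALL":
            objects.append(("ALL",))
            i += 1
        elif tokens[i] == "CHAIN":
            if i + 1 >= len(tokens):
                raise ValueError("CHAIN must be followed by a chain identifier")
            objects.append(("CHAIN", tokens[i + 1]))
            i += 2
        else:
            # Expect: <chain1> <number1> <insert1> to <chain2> <number2> <insert2>
            chunk = tokens[i:i + 7]
            if len(chunk) != 7 or chunk[3] != "to":
                raise ValueError(f"cannot parse object starting at {tokens[i]!r}")
            start = (chunk[0], int(chunk[1]), chunk[2])
            end = (chunk[4], int(chunk[5]), chunk[6])
            objects.append(("RANGE", start, end))
            i += 7
    return objects
```

Calling `parse_objects("CHAIN A B 1 _ to B 65 _")` yields one CHAIN object followed by one RANGE object, which mirrors the combined example above. A string with fused fields, such as `E 20 _ to E 160P`, raises `ValueError` because the range no longer has seven separate fields.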

R11 $\rightarrow$ R33 and V1 $\rightarrow$ V3 are a rotation matrix and translation vector, respectively.
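
This section does not spell out how the matrix and vector are applied to the coordinates. Assuming the common convention x' = R x + V, with Rij taken as row i, column j, the operation can be sketched as follows. The function name `apply_transformation` is hypothetical, and the convention itself is an assumption rather than something stated here.

```python
def apply_transformation(coords, rotation, translation):
    """Return x' = R x + V for every (x, y, z) in coords.

    Assumes rotation[i][j] holds R(i+1)(j+1) and translation[i] holds V(i+1);
    this convention is an assumption, not taken from the STAMP source.
    """
    transformed = []
    for point in coords:
        transformed.append(tuple(
            sum(rotation[row][col] * point[col] for col in range(3))
            + translation[row]
            for row in range(3)
        ))
    return transformed


# Identity rotation with zero translation leaves coordinates unchanged.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]
unchanged = apply_transformation([(1.0, 2.0, 3.0)], identity, zero)
```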

Thus, a full description of three domains might look something like this:

/data/newpdb/pdb/pdb1ton.ent 1ton { ALL 
0.9876 0.34 0.543  19.23
1.0  2.34   0.98473332  1.0
0.023  0.94 4.345     20.0 }
/data/newpdb/pdb/pdb2kai.ent 2kai_Kallikrien { CHAIN X CHAIN Y }
/data/newpdb/pdb/pdb3sgb.ent 3sgbe_SGprotease { E 20 _ to E 160 P 
1.0 0.0 0.0   0.0
0.0 1.0 0.0   0.0
0.0 0.0 1.0   0.0 }

Note the spaces. There must be spaces separating each keyword or datum to be read, even between the braces. For example:

/data/newpdb/pdb/pdb3sgb.ent 3sgb_protease{E 20 _ to E 160P}

would not be allowed.

In the second domain (Kallikrien) the transformation will be set equal to the identity matrix with a translation of zero, since none has been supplied.
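
The spacing rule and the default transformation can be illustrated with a small reader for a single domain entry. This is a sketch under the layout described above (objects on the first line, then three optional rows of four numbers, with the closing brace as its own field); `parse_domain` and `IDENTITY` are hypothetical names, and STAMP's real reader may be more lenient or stricter.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def parse_domain(entry):
    """Parse one domain entry into (file, label, object tokens, R, V).

    Hypothetical helper for illustration; not part of STAMP.
    """
    lines = [ln.split() for ln in entry.strip().splitlines() if ln.strip()]
    first = lines[0]
    # The opening brace must be a separate field after the identifier label.
    if len(first) < 4 or first[2] != "{":
        raise ValueError("expected '<file name> <identifier label> { <objects>'")
    file_name, label = first[0], first[1]
    rotation = [row[:] for row in IDENTITY]  # default: identity matrix
    translation = [0.0, 0.0, 0.0]            # default: zero translation
    if first[-1] == "}":
        if len(lines) != 1:
            raise ValueError("unexpected text after closing brace")
        objects = first[3:-1]
    else:
        objects = first[3:]
        rows = lines[1:]
        if len(rows) != 3 or rows[-1][-1:] != ["}"]:
            raise ValueError("expected three matrix rows ending with ' }'")
        rows[-1] = rows[-1][:-1]  # drop the closing brace
        for i, row in enumerate(rows):
            if len(row) != 4:
                raise ValueError("each row needs Ri1 Ri2 Ri3 Vi")
            values = [float(v) for v in row]
            rotation[i] = values[:3]
            translation[i] = values[3]
    if not objects:
        raise ValueError("a domain needs at least one object")
    return file_name, label, objects, rotation, translation
```

With this sketch, the `2kai_Kallikrien` entry returns the identity matrix and a zero translation, while the disallowed `3sgb_protease{E 20 _ to E 160P}` line raises `ValueError` because the brace is fused to the identifier.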

The domains must be listed at the start of a file (i.e. nothing may come before them in a file), but anything may come afterwards, provided that it contains no braces (i.e. { or }) unless they are on lines containing `%` in the first column.

It is possible to reverse the direction of an object in a domain description. For example, if one has two objects, one can reverse the direction of one or more of these by placing the word "REVERSE" in front of the object, e.g.:

/data/newpdb/pdb/pdb4mbn.ent { REVERSE _ 1 _ to _ 20 _ _ 21 _ to _ 120 _ }
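
One way to picture the REVERSE keyword is as a flag attached to the object that follows it. The sketch below groups the object fields of a domain and records which objects were preceded by REVERSE. The name `split_objects` is hypothetical, and the sketch does not reorder any coordinates.

```python
def split_objects(tokens):
    """Group object tokens and flag objects that are preceded by REVERSE.

    Hypothetical helper for illustration; not part of STAMP.
    """
    sizes = {"ALL": 1, "CHAIN": 2}  # any other object is a 7-field residue range
    result = []
    i = 0
    while i < len(tokens):
        reverse = tokens[i] == "REVERSE"
        if reverse:
            i += 1
            if i >= len(tokens):
                raise ValueError("REVERSE must be followed by an object")
        size = sizes.get(tokens[i], 7)
        obj = tokens[i:i + size]
        if len(obj) != size:
            raise ValueError(f"incomplete object starting at {tokens[i]!r}")
        result.append((reverse, obj))
        i += size
    return result


fields = "REVERSE _ 1 _ to _ 20 _ _ 21 _ to _ 120 _".split()
flagged = split_objects(fields)
```

Here `flagged` holds two residue ranges, the first marked as reversed and the second not.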

Geoff Barton