The TOPOL system for reasoning about protein topology in Prolog [4] uses the secondary structure definitions deposited in the Brookhaven Protein Data Bank. Several rules were developed so that the TOPOL rules can be applied to secondary structure definitions derived by the Kabsch and Sander method, and to give access to the angle and accessibility information.
Kabsch and Sander helix definitions allow a residue to belong to more than one type of helix. Thus, it often occurs that a region of α-helix will overlap with a region of 3₁₀-helix at the C-terminal end of the α-helix. TOPOL expects each residue to belong to only one secondary structure element. Accordingly, where overlaps occur, the helix definitions are compressed into a single thelix definition. For example:
thelix(186,191,1fb4l,[ helix(189,191,1fb4l,three_ten), helix(186,191,1fb4l,alpha) ])
shows the starting and ending residue numbers for the concatenated helix, the chain identifier, then a list of the helix definitions that overlap.
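The compression of overlapping helices can be sketched in Prolog as an interval merge. This is a minimal illustration, not the TOPOL implementation: the predicate names thelices/2, merge_helices/2 and merge_run/6 are hypothetical, the helix/4 facts are assumed to be available in the form shown in the example above, and it is an assumption that a helix with no overlap is also wrapped as a thelix. The identifier is written as the quoted atom '1fb4l' because an atom beginning with a digit must be quoted in standard Prolog. The third fact is invented purely to show a non-overlapping case.

```prolog
% Input facts in the helix(Start, End, Chain, Type) form used above.
helix(186, 191, '1fb4l', alpha).
helix(189, 191, '1fb4l', three_ten).
helix(200, 210, '1fb4l', alpha).      % hypothetical, for illustration only

% thelices(+Chain, -THelices)
% Collect the helices of Chain, order them by start residue and merge
% every run of overlapping helices into one thelix/4 term.
thelices(Chain, THelices) :-
    findall(helix(S, E, Chain, T), helix(S, E, Chain, T), Helices0),
    sort(Helices0, Helices),          % standard order: by Start, then End
    merge_helices(Helices, THelices).

merge_helices([], []).
merge_helices([helix(S, E, C, T) | Rest], THelices) :-
    merge_run(Rest, S, E, C, [helix(S, E, C, T)], THelices).

% merge_run(+Remaining, +Start, +End, +Chain, +Members, -THelices)
% Members accumulates the helices of the current run, most recent first.
merge_run([helix(S1, E1, C, T1) | Rest], S, E, C, Members, THelices) :-
    S1 =< E,                          % next helix overlaps the current run
    !,
    E2 is max(E, E1),
    merge_run(Rest, S, E2, C, [helix(S1, E1, C, T1) | Members], THelices).
merge_run(Rest, S, E, C, Members, [thelix(S, E, C, Members) | THelices]) :-
    merge_helices(Rest, THelices).    % close the run and start the next one
```

With these facts, the query thelices('1fb4l', Ts) yields a thelix spanning residues 186 to 191 whose list holds the three_ten and alpha definitions, followed by a separate thelix for the invented helix at 200 to 210.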
Strands are identified simply by their start and end residue numbers, rather than by a list of all residues in the strand. In addition, regions of polypeptide that are in neither strand nor helix are defined as the structure tloop. The TOPOL clauses follows/2, is_parallel_to/2 and is_antiparallel_to/2 are then defined in terms of the tstrand, tloop and thelix clauses. For example:
follows(tstrand(175,184,1fb4l),tloop(172,174,1fb4l))
specifies that the given tstrand follows the tloop in the structure, as is evident from the residue numbers.
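One way such a follows/2 relation could be derived from the segment definitions is sketched below. It is an assumption, not a description of the TOPOL source, that "follows" means the second segment ends at the residue immediately before the first one starts, on the same chain and with consecutive residue numbering. The predicates segment/1 and segment_span/4 are hypothetical helpers introduced for the sketch, and the identifier is quoted as '1fb4l' because an atom beginning with a digit must be quoted in standard Prolog.

```prolog
% Segments taken from the examples in the text, stored as facts.
segment(tloop(172, 174, '1fb4l')).
segment(tstrand(175, 184, '1fb4l')).
segment(thelix(186, 191, '1fb4l',
               [ helix(189, 191, '1fb4l', three_ten),
                 helix(186, 191, '1fb4l', alpha) ])).

% segment_span(?Segment, ?Start, ?End, ?Chain)
% Uniform access to the residue range of any segment type.
segment_span(tstrand(S, E, C),   S, E, C).
segment_span(tloop(S, E, C),     S, E, C).
segment_span(thelix(S, E, C, _), S, E, C).

% follows(?B, ?A)
% B follows A if both lie on the same chain and B starts at the
% residue directly after the last residue of A (assumed definition).
follows(B, A) :-
    segment(A),
    segment_span(A, _, EndA, Chain),
    segment(B),
    segment_span(B, StartB, _, Chain),
    StartB =:= EndA + 1.
```

With these facts, the query follows(X, tloop(172,174,'1fb4l')) binds X to tstrand(175,184,'1fb4l'). Real PDB numbering can contain gaps and insertion codes, so a production rule would need a more careful adjacency test than EndA + 1.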
The full interface to TOPOL includes calls from Prolog to Fortran routines to fit straight lines through helices and strands and to calculate overlaps, distances and angles. The details of this interface are under further development and will be described elsewhere.