Helix definitions

Next: Beta Structure Definitions Up: Secondary Structure Definition Previous: Secondary Structure Definition

Helix definitions

In order to illustrate the definition of secondary structure using Kabsch and Sander's rules the complete Prolog code for the definition of helix structure is shown here. The basic building block for the definition of a helix is the kturn rule. This looks up the hydrogen bonding information stored in the ks clauses and if an pattern with an energy of at least -0.5 KCal is seen, then a turn is defined.



/* kturn rule*/

kturn(N,N3,CID,Type):-
     ks(N,_,CID,_,_,_,_,_,_,_,[B1,E1],_,[B2,E2],_,_,_),
     turn_type(NN,Type),
     N3 is N + NN,
     B1A is N + B1,
     B2A is N + B2,
     check_bond(N3,[B1A,E1],[B2A,E2]).

/* turn_type facts */

turn_type(3,three_turn).
turn_type(4,four_turn).
turn_type(5,five_turn).
turn_type(6,six_turn).

/* check_bond rules */

check_bond(N3,[N3,E1],_):-
           E1 =< -0.5.
check_bond(N3,_,[N3,E2]):-
           E2 =< -0.5.

The kturn rule operates by first looking up the unique residue number, offset and energy of the two hydrogen bonds listed as O-HN hydrogen bonds for that residue. A turn type is then looked up (e.g. three_turn) to obtain the required offset for a hydrogen bond (in this example, 3). The check_bond rule is then used to see if a hydrogen bond exists between the current residue and one three further down the polypeptide chain.

Given the kturn, the definition of a minimal helix is readily expressed as:



minimal_helix(N,N2,CID,Type):-
    kturn(N,Nend2,CID,Type),
    NM1 is N - 1,
    kturn(NM1,_,CID,Type),
    N2 is Nend2 - 1.

Prolog rules then define the start and end points of a helix:



helix_start(N,CID,Type):-
      minimal_helix(N,_,CID,Type),
      NM1 is N - 1,
      \+ in_minimal_helix(NM1,CID,Type).     % \+ means 'not'
 
helix_end(N,CID,Type):-
      minimal_helix(_,N,CID,Type),
      NP1 is N + 1,
      \+ in_minimal_helix(NP1,CID,Type).

in_minimal_helix(N,CID,Type):-           
      nonvar(N),                         %must call with N instantiated
      minimal_helix(N1,N2,CID,Type),     %i.e. N must have a value before 
      N >= N1,                           %                       the call.
      N =< N2,
      !.                                 %succeed only once.

The rule in_minimal_helix succeeds once if the residue number is found within a minimal helix. The helix_start rule reads `residue is the first residue in a helix if it is the first residue in a minimal helix, and residue is not in a minimal helix'. Similarly for the helix_end rule.

Three helix_type facts link the different types of turn with - , - and five- helix names.



helix_type(three_turn,three_ten).
helix_type(four_turn,alpha).
helix_type(five_turn,five).

Finally, we can write the helix rule by making use of the helix_start and helix_end rules.




helix(N1,N2,CID,Htype):-
  helix_start(N1,CID,Type),
  find_helix_end(N1,CID,Type,N2),
  helix_type(Type,Htype).

find_helix_end(N,CID,Type,N2):-
  helix_end(N2,CID,Type),
  N2 > N,
  !.

The helix rule reads `find the start of a helix, find the first end of helix that follows this start, then look up the helix type.'

The Prolog rules for helix, make up the computer program that defines helix structures. However, a feature of Prolog is that any rule can be replaced by a collection of facts. For example, the kturn rule can be replaced or augmented by facts that are specific to a particular protein:



kturn(416,419,1fb4h,three_turn).
kturn(418,421,1fb4h,three_turn).
kturn(418,422,1fb4h,four_turn).
kturn(430,434,1fb4h,four_turn).
kturn(430,435,1fb4h,five_turn).
kturn(431,434,1fb4h,three_turn).

these facts may then be used in exactly the same way as the general purpose rule with the same name and number of arguments. In this way, if a rule is particularly time consuming to evaluate, it need only be evaluated once for the protein, then stored as a set of facts. Other rules that subsequently make use of the rule need only look up the corresponding facts in the database, rather than repeat the time-consuming rule evaluation. This principal is similar to storing intermediary results in a conventional Fortran or C program. The difference in Prolog is that the routines that access the pre-calculated data are identical to those that access the routines that initially calculated the data.

Next: Beta Structure Definitions Up: Secondary Structure Definition Previous: Secondary Structure Definition

gjb@bioch.ox.ac.uk