The examples so far have assumed that the ks clauses for only one protein are resident in the Prolog database. However, it should be possible to apply a query to more than one protein at a time. One approach would be to load the ks clauses for all proteins under study into memory. An alternative is to arrange for Prolog to access the clauses where they reside on disk, rather than reading them into memory. The choice between these approaches is central to the design of large Prolog systems and is a subject of continuing research; some solutions are raised in the discussion.
The limited memory of the Sun-3/50 workstation on which our system was originally developed ruled out reading the data for all proteins into the Prolog system. Accordingly, a simple solution was adopted whereby each protein is loaded in turn for analysis. To economise on disk space, the secondary structure definitions for each protein are not pre-calculated but are computed 'on the fly' as the query is executed. The scan_with/2 facts and get_protein/2 rules manage this operation. For example:
    ?- scan_with(helix, ScanList),
       % identifiers beginning with a digit must be quoted to be read as atoms
       Plist = ['1fb4', '1sgt', '4fxn'],
       member(PID, Plist),
       get_protein(PID, ScanList),
       % ... goals using the helix definitions go here ...
       fail.
The scan_with/2 fact returns a list of procedures to be executed by get_protein/2. In this example, ScanList would be bound to [kturn, minimal_helix, helix_start, helix_end, helix], which names the rules for the structural units that are to be used. Plist is simply a list of identifiers for the proteins to be analysed. The member/2 rule returns successive members of Plist on backtracking, which the final fail forces, and thus feeds each value of PID in turn to get_protein/2.
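The backtracking behaviour described above can be sketched as a failure-driven loop. This is a minimal illustration only, assuming SWI-Prolog; the scan_all/1 wrapper and the visited/1 bookkeeping fact are hypothetical names, and get_protein/2 is replaced here by a stub that merely records which protein it was called with.

```prolog
:- use_module(library(lists)).   % member/2
:- dynamic visited/1.            % hypothetical: records the proteins processed

% The scan list for helices, as given in the text.
scan_with(helix, [kturn, minimal_helix, helix_start, helix_end, helix]).

% Hypothetical stub. The real get_protein/2 loads and analyses the protein;
% this one only notes that it was called.
get_protein(PID, _ScanList) :-
    assertz(visited(PID)).

% Failure-driven loop: fail/0 forces backtracking into member/2,
% which then supplies the next protein identifier.
scan_all(Plist) :-
    scan_with(helix, ScanList),
    member(PID, Plist),
    get_protein(PID, ScanList),
    fail.
scan_all(_).                     % succeed once every protein has been tried
```

Calling scan_all(['1fb4', '1sgt', '4fxn']) visits the three proteins in list order. The second clause is needed only because the loop is wrapped in a predicate; without it the call would fail after the last protein.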
A call to get_protein/2 first loads the ks clauses for the specified protein, then loads the general rules for helix definition (kturn, minimal_helix, etc.). All solutions to these general rules are then found for the protein, and the resulting specific structural facts are asserted into the Prolog database. Once all specific facts for the protein have been loaded, the goals that require the secondary structure definitions are executed.
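The step in which all solutions to the general rules are found and asserted might look roughly as follows. This is a sketch under stated assumptions, not the system's actual code: it assumes SWI-Prolog, omits the loading of clauses from disk, and invents a toy ks/2 format, a one-argument kturn/1 rule, the cache_solutions/1 predicate and the found/2 store purely for illustration.

```prolog
:- use_module(library(lists)).   % member/2
:- dynamic ks/2, found/2.

% Hypothetical toy data standing in for the ks clauses of one protein:
% ks(ResidueNumber, Feature).
ks(3, turn).
ks(4, turn).
ks(9, turn).

% Hypothetical general rule: a kturn is reported at every residue
% flagged as a turn. The real rules are more involved.
kturn(I) :- ks(I, turn).

% Run each rule named in ScanList and store every solution as a
% found/2 fact, so that later goals look results up instead of
% recomputing them.
cache_solutions(ScanList) :-
    retractall(found(_, _)),     % assumption: drop the previous protein's facts
    forall(( member(Rule, ScanList),
             Goal =.. [Rule, I], % assumes every rule takes one argument
             call(Goal) ),
           assertz(found(Rule, I))).
```

After cache_solutions([kturn]), the query found(kturn, I) enumerates the residues 3, 4 and 9 from the stored facts. Storing results under a separate found/2 predicate, rather than under the rule's own name, is a choice made here to avoid duplicate answers from the rule and its cached facts.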