
Screening the Results

To be useful, any automatic algorithm must be able to tell when the definitions it has produced are likely to disagree with the expected standard. Three rules about domains were derived to enable the algorithm to identify such cases.

  1. Count the number of segments in a single domain protein.

    A single domain protein may have chopped segments removed and later re-assigned, or may be split into domains that are then recombined on the basis of compactness. If the final domain is made up of a large number of segments, it is unlikely to be a true single domain protein. Single domain proteins made up of 4 or more segments were flagged for further visual inspection (table on diskette).

  2. Calculate the number of residues per segment for domains consisting of two or more segments.

    If this number is small, it is unlikely that the domain is a real domain. This suggests a lower limit on the size of such domains that is larger than MDS. The limit chosen was 50 residues per segment (table on diskette).

  3. For a single segment domain inserted into a domain of two or more segments, calculate the ratio of the size of the domain into which the inserted domain is placed to the size of the inserted domain.

    If the ratio is large, the inserted domain is unlikely to be a real domain. The limit set was 1.6 (table on diskette).
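
The three rules can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used in this work: the `Domain` class, the `is_inserted` test and the `screen` function are hypothetical names, segments are assumed to be inclusive (start, end) residue ranges, and the direction of the comparisons in rules 2 and 3 (flag below 50 residues per segment, flag above a ratio of 1.6) is inferred from the text rather than stated in it.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds quoted in the text; how they are compared is an assumption.
MIN_SEGMENTS_TO_FLAG = 4         # rule 1: "4 or more segments"
MIN_RESIDUES_PER_SEGMENT = 50    # rule 2: assumed to flag values below this
MAX_HOST_TO_INSERT_RATIO = 1.6   # rule 3: assumed to flag ratios above this


@dataclass
class Domain:
    """Hypothetical container: inclusive (start, end) residue ranges."""
    segments: List[Tuple[int, int]]

    @property
    def size(self) -> int:
        return sum(end - start + 1 for start, end in self.segments)


def is_inserted(inner: Domain, host: Domain) -> bool:
    """Assumed test: a one-segment domain lying between two host segments."""
    if len(inner.segments) != 1 or len(host.segments) < 2:
        return False
    start, end = inner.segments[0]
    before = any(h_end < start for _, h_end in host.segments)
    after = any(h_start > end for h_start, _ in host.segments)
    return before and after


def screen(domains: List[Domain]) -> List[Tuple[int, int]]:
    """Return (rule number, domain index) pairs for suspicious domains."""
    flags = []
    # Rule 1: a single domain protein built from many segments.
    if len(domains) == 1 and len(domains[0].segments) >= MIN_SEGMENTS_TO_FLAG:
        flags.append((1, 0))
    # Rule 2: too few residues per segment in a multi-segment domain.
    for i, dom in enumerate(domains):
        n = len(dom.segments)
        if n >= 2 and dom.size / n < MIN_RESIDUES_PER_SEGMENT:
            flags.append((2, i))
    # Rule 3: host domain much larger than the domain inserted into it.
    for i, inner in enumerate(domains):
        for j, host in enumerate(domains):
            if i != j and is_inserted(inner, host):
                if host.size / inner.size > MAX_HOST_TO_INSERT_RATIO:
                    flags.append((3, i))
    return flags
```

For example, `screen([Domain([(1, 100), (151, 250)]), Domain([(101, 150)])])` flags the second domain under rule 3, because the host has 200 residues and the inserted domain has 50, giving a ratio of 4.0. Returning rule numbers rather than rejecting domains outright reflects the text, where flagged cases are passed on for visual inspection.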



as@bioch.ox.ac.uk