
Comparison of secondary structure definition methods

All secondary structure prediction methods are trained and tested on secondary structure definitions derived from known structures. Defining secondary structure from co-ordinates is an inexact process, both because methods differ in their concept of what constitutes a secondary structure and because experimental structures contain errors and inconsistencies. This was illustrated by Colloc'h et al., who compared DSSP[38], DEFINE[39] and P-curve[71] on a non-redundant set of 154 proteins and found that all three methods agreed at only 63% of positions.

Here we show the differences between the DSSP[38], DEFINE[39] and STRIDE[40] definitions on the RS126 protein set. When compared pairwise, DSSP and STRIDE agree at 95% of positions, whereas DSSP and DEFINE agree at 73% and STRIDE and DEFINE at 74%. All three methods agree at only 71% of positions.
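The agreement figures above are position-by-position comparisons of the three-state assignments. The following is a minimal sketch of that calculation, assuming each definition has already been reduced to an equal-length string over H (helix), E (sheet) and C (coil). The function name `agreement` and the toy strings are illustrative only and are not taken from RS126.

```python
from itertools import combinations


def agreement(*assignments):
    """Percentage of positions at which all given assignments carry the same state."""
    lengths = {len(a) for a in assignments}
    if len(lengths) != 1:
        raise ValueError("assignments must have equal length")
    n = lengths.pop()
    if n == 0:
        raise ValueError("assignments must not be empty")
    # A column agrees when every method reports the same state at that position.
    same = sum(1 for column in zip(*assignments) if len(set(column)) == 1)
    return 100.0 * same / n


# Toy three-state strings (hypothetical, not real data).
toy = {
    "DSSP":   "CCHHHHHHCCEEEECC",
    "STRIDE": "CCHHHHHHHCEEEECC",
    "DEFINE": "CHHHHHHHHCCEEEEC",
}

# Pairwise agreement for every pair of methods, then agreement of all three.
pairwise = {(a, b): agreement(toy[a], toy[b]) for a, b in combinations(toy, 2)}
all_three = agreement(*toy.values())
```

Passing two strings gives a pairwise figure and passing three gives the three-way figure, so the same function covers both kinds of comparison reported here.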

Table 2 shows that DEFINE defines more sheet: 7.1% more relative to DSSP and 5.6% more relative to STRIDE. DEFINE also defines helix more often. As a consequence of these two factors, DSSP and STRIDE define more coil than DEFINE, by 8.4% and 6.1% respectively.
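Differences in state composition of this kind can be computed from the assignment strings directly. The sketch below assumes three-state strings over H, E and C and reports differences as percentage points of all residues. Whether Table 2 uses that convention is not stated in this section, so treat the choice as an assumption. The names `composition` and `composition_difference` and the toy strings are hypothetical.

```python
from collections import Counter


def composition(assignment, states="HEC"):
    """Percentage of residues assigned to each state."""
    counts = Counter(assignment)
    n = len(assignment)
    return {s: 100.0 * counts[s] / n for s in states}


def composition_difference(a, b, states="HEC"):
    """Percentage-point difference in state composition (a minus b)."""
    comp_a, comp_b = composition(a, states), composition(b, states)
    return {s: comp_a[s] - comp_b[s] for s in states}


# Toy strings (hypothetical, not real data).
dssp_like = "CCHHHHHHCCEEEECC"
define_like = "CHHHHHHHHCEEEEEC"

comp = composition(dssp_like)
diff = composition_difference(define_like, dssp_like)
```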

The lengths of secondary structure elements as defined by DSSP, STRIDE and DEFINE are summarised in Table 3 and Figure 1. DEFINE does not define sheet regions of fewer than 4 residues. The mean segment lengths for DEFINE are higher than those of STRIDE and DSSP for all secondary structure states. Figure 1 shows that DSSP has a peak in the helix length distribution at 4 residues, which is not found with the STRIDE or DEFINE definitions. With the exception of this peak at 4, the overall shapes of the DSSP and STRIDE length distributions are similar.
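Segment lengths such as those summarised above can be obtained by splitting an assignment string into runs of identical states. This is a minimal sketch under the same three-state string assumption. The helper names `segment_lengths` and `mean_segment_length` are hypothetical and the input is a toy string.

```python
from collections import defaultdict
from itertools import groupby


def segment_lengths(assignment):
    """Map each state to the list of lengths of its consecutive runs."""
    lengths = defaultdict(list)
    for state, run in groupby(assignment):
        lengths[state].append(sum(1 for _ in run))
    return dict(lengths)


def mean_segment_length(assignment):
    """Mean run length per state, for states that occur at least once."""
    return {s: sum(v) / len(v) for s, v in segment_lengths(assignment).items()}


# Toy string (hypothetical, not real data).
toy = "CCHHHHHHCCEEEECC"
runs = segment_lengths(toy)
means = mean_segment_length(toy)
```

A histogram of the per-state lists pooled over a whole protein set would give length distributions of the kind plotted in Figure 1.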

When assessing prediction methods, the average Q3 was calculated against each definition method for all runs. However, because DEFINE is so dissimilar to DSSP and STRIDE, all results from DEFINE have been omitted from the discussion.
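Q3 is the percentage of residues predicted in the correct one of the three states. The sketch below scores one prediction against several definitions and averages Q3 over proteins. Averaging per protein rather than per residue is an assumption, since this section does not say which was used. The function names and the toy data are hypothetical.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def q3_by_definition(predicted, definitions):
    """Q3 of one prediction against each definition method's assignment."""
    return {name: q3(predicted, observed) for name, observed in definitions.items()}


def average_q3(pairs):
    """Mean of per-protein Q3 over (predicted, observed) pairs (assumed convention)."""
    scores = [q3(p, o) for p, o in pairs]
    return sum(scores) / len(scores)


# Toy data (hypothetical, not real results).
prediction = "CCHHHHHCCCEEEECC"
definitions = {
    "DSSP":   "CCHHHHHHCCEEEECC",
    "STRIDE": "CCHHHHHHHCEEEECC",
}
scores = q3_by_definition(prediction, definitions)
mean_score = average_q3([(prediction, d) for d in definitions.values()])
```

The same prediction receives a different Q3 under each definition, which is why the choice of definition method matters when comparing reported accuracies.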


james@ebi.ac.uk