
Effect on Q3 of reducing redundancy in multiple alignments

Although all sequences in a multiple sequence alignment that are not 100% identical contribute to the prediction, the most informative sequences are those that differ most from the query. Here we test the effect of systematically removing from the alignment those sequences that are more than 95, 80, 75 or 60% identical to the query sequence. Table 9 summarises the effect of these thresholds. The average Q3 accuracy improves slightly as the percentage identity threshold is reduced; for the consensus method the improvement is 0.3% at the 75% level. Since removing redundant sequences does not make the predictions any worse, and prediction methods run faster with fewer sequences, the 75% cutoff was used for all predictions other than those shown in Table 8.
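
The filtering step described above can be illustrated with a minimal sketch. It is not the procedure actually used for Table 9: the function names `percent_identity` and `filter_redundant` are hypothetical, and the identity measure (matches divided by the number of columns in which neither sequence has a gap) is an assumption, since the text does not state how percentage identity was calculated.

```python
def percent_identity(query, other, gap="-"):
    """Percentage identity between two aligned sequences of equal length.

    Assumption: only columns where neither sequence has a gap are compared.
    """
    if len(query) != len(other):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(query, other):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def filter_redundant(query, aligned, cutoff=75.0):
    """Keep only aligned sequences at or below `cutoff` percent identity
    to the query; sequences above the cutoff are treated as redundant."""
    return [s for s in aligned if percent_identity(query, s) <= cutoff]


# Made-up toy alignment, for illustration only.
query = "ACDEFGHIKL"
aligned = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGHAAA", "AC--FGHIKL"]
kept_75 = filter_redundant(query, aligned, cutoff=75.0)
kept_95 = filter_redundant(query, aligned, cutoff=95.0)
```

Lowering the cutoff removes more sequences, which is why the filtered alignments are both smaller and quicker to process.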

When predictions were compared to DEFINE secondary structure definitions, the average Q3 for each prediction was between 3 and 8% lower than for DSSP and STRIDE definitions. As none of the prediction methods examined here was trained on DEFINE definitions, we do not consider comparison of predictions to DEFINE definitions any further.
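
To show how the same prediction can score differently against different reference definitions, the sketch below computes Q3 as the percentage of residues whose predicted three-state label matches the reference. The function name `q3` and the state strings are hypothetical; the strings are made up for illustration and are not output from DSSP, STRIDE or DEFINE.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state (H, E or C)
    matches the reference state at the same position."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have equal length")
    if not observed:
        raise ValueError("state strings must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up example: one prediction scored against two reference assignments.
prediction = "HHHHEEECCC"
reference_a = "HHHHEEECCC"
reference_b = "HHHCEEECCC"
score_a = q3(prediction, reference_a)
score_b = q3(prediction, reference_b)
```

A method trained on one set of definitions will tend to score lower against a reference that assigns states differently, even though the prediction itself is unchanged.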


james@ebi.ac.uk