JPred - Accuracy

Documentation - Accuracy (How long is a piece of string then?)

The following data is presented as a *brief* comparison only. For greater depth, see the Jpred and Jnet papers on the references page, specifically the two Cuff & Barton papers in Proteins.

Jpred was originally developed as a result of a study comparing various contemporary secondary structure prediction methods. During the evaluation, two independent sets of protein sequences were developed that allowed the validation of both methods. Both sets come from proteins deposited in the PDB between 1995 and 2000 and are non-redundant to 5 standard deviations when using the Z-score method to determine sequence similarity. The smaller set (396 sequences) has no members that are similar to each other or to the sequences in the set of 126 sequences produced by Rost & Sander, and all of its members have full DSSP definitions. The 406 protein data set was derived from structures not found in the 396 set, and so provides a true test of each prediction method (see Cuff & Barton, 2000 for details).

The Rost & Sander set was used to train a number of secondary structure methods (PHD, NNSSP, DSC and PREDATOR), and so is included to compare those methods' accuracy on their training data with their accuracy on data they have not seen before.

In the table below, the average Q₃ results for the three sets are shown.

Method	126 protein set	396 protein set	406 protein set
PHD	73.5	71.9	73.3
DSC	71.1	68.4	70.6
PREDATOR	70.3	68.6	70.7
NNSSP	72.7	71.4	72.3
Mulpred	67.2	66.1	N/A
Zpred	66.7	64.8	62.0
Consensus (Jpred)	74.8	72.9	74.6
Jnet	N/A	N/A	76.4
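
As a reminder of what the numbers in the table measure, Q₃ is the percentage of residues whose predicted three-state label matches the observed one. The following is a minimal sketch only, not the code used in the original study. The function names `q3` and `average_q3`, the use of one-letter state strings, and the choice to average per protein are all assumptions made for illustration.

```python
def q3(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the two state strings agree.

    Both arguments are assumed to be equal-length strings of one-letter
    three-state labels (for example H, E and C); this encoding is hypothetical.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def average_q3(pairs) -> float:
    """Average the per-protein Q3 over (predicted, observed) pairs.

    Averaging per protein, rather than pooling all residues, is an assumption.
    """
    scores = [q3(p, o) for p, o in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return sum(scores) / len(scores)
```

For example, `q3("HHEC", "HHEE")` compares four residues, three of which agree, and so returns 75.0.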

The absolute accuracies are only meaningful for PHD and DSC, since the versions of PREDATOR and NNSSP used for this test were trained on a larger set of proteins which included the 126 sequences.

All results shown here used DSSP as the secondary structure definition against which the predictions were compared. The DSSP 8-state definition was reduced to three states as follows: H and G to helix, E and B to strand, with all other states becoming coil. The alignments used for these predictions were generated automatically with the same method that was used by Jpred.
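
The state reduction described above can be sketched in a few lines. This is an illustrative sketch rather than the code used by Jpred: the function name `reduce_dssp` and the output letters H, E and C are assumptions, and any character not listed as helix or strand (including a blank) is treated as coil.

```python
# DSSP states that collapse to helix and strand, as described in the text.
HELIX_STATES = {"H", "G"}
STRAND_STATES = {"E", "B"}


def reduce_dssp(states: str) -> str:
    """Map a string of DSSP 8-state codes to three states.

    The output letters (H = helix, E = strand, C = coil) are a hypothetical
    convention chosen for this example.
    """
    reduced = []
    for state in states:
        if state in HELIX_STATES:
            reduced.append("H")
        elif state in STRAND_STATES:
            reduced.append("E")
        else:
            # Every other DSSP code, including a blank, becomes coil.
            reduced.append("C")
    return "".join(reduced)
```

For example, `reduce_dssp("HGIEBTS ")` yields `"HHCEECCC"`.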

The accuracy of 76.4% is what can be expected from the service today, which uses only Jnet. Jpred is therefore now the server that uses Jnet to make predictions, rather than a consensus of other methods.

The Barton Group