Hints and tips for protein secondary structure prediction with Jpred
- Protein secondary structure prediction algorithms are trained with globular proteins. Unless the algorithm is specifically trained only with transmembrane proteins (such as PHDhtm), secondary structure prediction is very likely to be wrong for transmembrane segments.
- Coiled coils are another special type of secondary structure element. If a coiled coil prediction method predicts a coiled coil in your sequence, you will need to re-examine your sequence.
- Try to filter your sequence so that it contains no transmembrane or coiled coil regions prior to prediction. Secondary structure prediction methods apply a window of up to 19 residues during prediction, so a residue close to such a region is predicted partly from residues inside it. If there are coiled coil or transmembrane regions within your protein, the predictions at the boundaries between those regions and the rest of the sequence are likely to be incorrect.
- Secondary structure prediction methods are much more accurate (on average ca. 6-7%) when a multiple sequence alignment is used. Jpred creates one for you using PSI-BLAST, but a user-created alignment is likely to result in a better prediction.
- If Jpred finds a PDB structure that matches your query, think about why you are predicting the secondary structure. If a structure is available that shares significant sequence similarity to your query, it will make a much better starting point for model building than a secondary structure prediction. If you want to see how the predictions compare, by all means switch off the automatic scan.
- Jpred generates multiple sequence alignments with low redundancy. Redundant sequences in the alignment bias the profiles used by the prediction methods, so predictions are more accurate when a non-redundant alignment is used. If you want to use your own alignment for prediction, make sure that any redundant sequences are removed beforehand.
- For an initial 'look see', the best approach is to run Jpred with a single sequence. Jpred then builds a multiple sequence alignment automatically and uses it to make an accurate prediction.
- Jnet predicts solvent accessibility at three thresholds: 25%, 5% and 0%. Residues predicted at 0% exposure are those likely to be 'very buried'.
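
The tip about transmembrane and coiled coil regions can be illustrated with a short Python sketch. It assumes you already have region coordinates from a separate predictor (the coordinates below are made up), and it assumes the 19-residue window is symmetric around the residue being predicted, which the text does not state. The function names are hypothetical and are not part of Jpred.

```python
# Sketch only: region coordinates are hypothetical, and a symmetric,
# centred window is an assumption (the text only says "up to 19 residues").
WINDOW = 19
HALF_WINDOW = WINDOW // 2  # 9 residues on either side of the centre


def boundary_affected(seq_len, regions, half_window=HALF_WINDOW):
    """Return sorted 0-based positions that lie outside the given regions
    but whose prediction window reaches into one of them.

    regions: list of (start, end) pairs, 0-based, end exclusive.
    """
    in_region = set()
    for start, end in regions:
        in_region.update(range(start, end))
    affected = set()
    for start, end in regions:
        lo = max(0, start - half_window)
        hi = min(seq_len, end + half_window)
        affected.update(p for p in range(lo, hi) if p not in in_region)
    return sorted(affected)


def split_outside_regions(sequence, regions):
    """Return (offset, subsequence) pairs for the parts of the sequence
    that fall outside every region, ready to be submitted separately."""
    segments = []
    pos = 0
    for start, end in sorted(regions):
        if start > pos:
            segments.append((pos, sequence[pos:start]))
        pos = max(pos, end)
    if pos < len(sequence):
        segments.append((pos, sequence[pos:]))
    return segments


if __name__ == "__main__":
    seq = "A" * 50
    tm_regions = [(20, 30)]  # hypothetical transmembrane segment
    print(boundary_affected(len(seq), tm_regions))
    print(split_outside_regions(seq, tm_regions))
```

Predictions at the positions returned by `boundary_affected` are the ones to treat with most suspicion if the full-length sequence is submitted; `split_outside_regions` gives the filtered pieces instead.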
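
For the tip about removing redundant sequences from your own alignment, one simple approach is a greedy pairwise-identity filter. The sketch below is not Jpred's own procedure: the 75% cutoff, the way identity is computed (over columns where both sequences have a residue) and the function names are all illustrative assumptions.

```python
# Sketch only: a greedy redundancy filter for an aligned set of sequences.
# The 75% cutoff and the identity definition are illustrative assumptions.

def percent_identity(a, b, gap="-"):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def remove_redundant(alignment, max_identity=75.0):
    """Keep a sequence only if it is at most max_identity percent identical
    to every sequence kept so far. The first entry (the query) is always kept.

    alignment: list of (name, aligned_sequence) pairs.
    """
    kept = []
    for name, seq in alignment:
        if all(percent_identity(seq, kept_seq) <= max_identity for _, kept_seq in kept):
            kept.append((name, seq))
    return kept


if __name__ == "__main__":
    alignment = [
        ("query", "ACDEFGHIKL"),
        ("near_copy", "ACDEFGHIKV"),  # 90% identical to the query
        ("distant", "ACD-WWWWWW"),
    ]
    for name, seq in remove_redundant(alignment):
        print(name, seq)
```

Because the filter is greedy, the result depends on the input order, so put the query sequence first.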
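
The three solvent accessibility thresholds can be combined into a single per-residue label. The sketch below assumes the three predictions are available as equal-length strings in which `B` marks a residue predicted buried at that threshold and any other character marks it exposed; that input format and the function name are assumptions made for illustration.

```python
# Sketch only: the "B" = buried string encoding is an assumed input format.

def burial_summary(sol25, sol5, sol0, buried="B"):
    """Collapse three per-residue burial predictions (25%, 5% and 0%
    thresholds) into one label per residue, using the strictest
    threshold at which the residue is predicted buried."""
    if not (len(sol25) == len(sol5) == len(sol0)):
        raise ValueError("the three prediction strings must have equal length")
    labels = []
    for at25, at5, at0 in zip(sol25, sol5, sol0):
        if at0 == buried:
            labels.append("very buried")
        elif at5 == buried:
            labels.append("buried at 5%")
        elif at25 == buried:
            labels.append("buried at 25%")
        else:
            labels.append("exposed")
    return labels


if __name__ == "__main__":
    for label in burial_summary("BBB-", "BB--", "B---"):
        print(label)
```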