S. E. Brenner, C. Chothia, and T. J. P. Hubbard.
Assessing sequence comparison methods with reliable structurally
identified distant evolutionary relationships.
Proc. Nat. Acad. Sci., 95:6073-6078, 1998.
G. J. Barton and M. J. E. Sternberg.
A strategy for the rapid multiple alignment of protein sequences:
Confidence levels from tertiary structure comparisons.
J. Mol. Biol., 198:327-337, 1987.
K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R. Hughey,
L. Holm, and C. Sander.
Predicting protein structure using hidden Markov models.
Proteins, Suppl. 1:134-139, 1997.
J. Park, S. A. Teichmann, T. Hubbard, and C. Chothia.
Intermediate sequences increase the detection of distant sequence
homologues.
J. Mol. Biol., 273:349-354, 1997.
C. Lemer, M. J. Rooman, and S. J. Wodak.
Protein structure prediction by threading methods: Evaluation of
current techniques.
Proteins, 23:337-355, 1996.
P. Y. Chou and G. D. Fasman.
Conformational parameters for amino acids in helical,
β-sheet, and random coil regions calculated from proteins.
Biochem., 13:211-222, 1974.
J. Garnier, D. J. Osguthorpe, and B. Robson.
Analysis and implications of simple methods for predicting the
secondary structure of globular proteins.
J. Mol. Biol., 120:97-120, 1978.
C. D. Livingstone and G. J. Barton.
Identification of functional residues and secondary structure from
protein multiple sequence alignment.
Meth. Enz., 266:497-512, 1996.
I. P. Crawford, T. Niermann, and K. Kirschner.
Prediction of secondary structure by evolutionary comparison:
Application to the alpha subunit of tryptophan synthase.
Proteins, 2:118-129, 1987.
G. J. Barton, R. H. Newman, P. F. Freemont, and M. J. Crumpton.
Amino acid sequence analysis of the annexin super-gene family of
proteins.
European J. Biochem., 198:749-760, 1991.
R. B. Russell, J. Breed, and G. J. Barton.
Conservation analysis and structure prediction of the SH2 family of
phosphotyrosine binding domains.
FEBS Letters, 304:15-20, 1992.
S. A. Benner and D. Gerloff.
Patterns of divergence in homologous proteins as indicators of
secondary and tertiary structure: A prediction of the structure of the
catalytic domain of protein kinases.
Adv. Enz. Reg., 31:121-181, 1990.
M. J. J. M. Zvelebil, G. J. Barton, W. R. Taylor, and M. J. E. Sternberg.
Prediction of protein secondary structure and active sites using the
alignment of homologous sequences.
J. Mol. Biol., 195:957-961, 1987.
R. D. King and M. J. E. Sternberg.
Identification and application of the concepts important for accurate
and reliable protein secondary structure prediction.
Prot. Sci., 5:2298-2310, 1996.
A. A. Salamov and V. V. Solovyev.
Prediction of protein secondary structure by combining nearest-
neighbor algorithms and multiple sequence alignments.
J. Mol. Biol., 247:11-15, 1995.
V. Biou, J. F. Gibrat, B. Robson, and J. Garnier.
Secondary structure prediction: combination of three different
methods.
Prot. Eng., 2:185-191, 1995.
K. Nishikawa and T. Ooi.
Amino acid sequence homology applied to the prediction of protein
secondary structures, and joint prediction with existing methods.
Biochim. Biophys. Acta, 871:45-54, 1986.
C. Geourjon and G. Deleage.
SOPMA: Significant improvements in protein secondary structure
prediction by consensus prediction from multiple alignments.
Comp. App. Biosci., 11:681-684, 1995.
F. M. Richards and C. E. Kundrot.
Identification of structural motifs from protein coordinate data:
secondary structure and first-level supersecondary structure.
Proteins, 3:71-84, 1988.
S. B. Needleman and C. D. Wunsch.
A general method applicable to the search for similarities in the
amino acid sequence of two proteins.
J. Mol. Biol., 48:443-453, 1970.
A. Murzin, S. E. Brenner, T. Hubbard, and C. Chothia.
SCOP: A structural classification of proteins database and the
investigation of sequences and structures.
J. Mol. Biol., 247:536-540, 1995.
M. Newman, C. Frazao, G. Khan, I. J. Tickle, T. L. Blundell, M. Safro,
N. Andreeva, and A. Zdanov.
X-ray analyses of aspartic proteinases. Structure and refinement
at 2.2 Angstroms resolution of bovine chymosin.
J. Mol. Biol., 221:1295, 1991.
A. Sali, B. Veerapandian, J. B. Cooper, S. I. Foundling, D. J. Hoover, and
T. L. Blundell.
High resolution X-ray diffraction study of the complex between
endothiapepsin and an oligopeptide inhibitor. The analysis of the inhibitor
binding and description of the rigid body shift in the enzyme.
EMBO J., 8:2179, 1989.
M. Bolognesi, G. Gatti, E. Menegatti, M. Guarneri, M. Marquart, E. Papamokos, and
R. Huber.
Three dimensional structure of the complex between pancreatic
secretory inhibitor (Kazal type) and trypsinogen at 1.8 Angstroms resolution.
J. Mol. Biol., 162:839, 1982.
R. B. Honzatko, W. A. Hendrickson, and W. E. Love.
Refinement of a molecular model for lamprey hemoglobin from
Petromyzon marinus.
J. Mol. Biol., 184:147, 1985.
J. L. Smith, P. W. R. Corfield, W. A. Hendrickson, and B. W. Low.
Refinement at 1.4 Angstroms resolution of a model of erabutoxin
b. Treatment of ordered solvent and discrete disorder.
Acta Cryst., 44:357, 1988.
P. M. D. Fitzgerald, B. M. McKeever, J. F. Van Middlesworth, and J. P. Springer.
Crystallographic analysis of a complex between Human
Immunodeficiency Virus Type 1 Protease and Acetyl Pepstatin at
2.0 Angstroms resolution.
J. Biol. Chem., 265:14209, 1990.
D. Frishman and P. Argos.
Incorporation of non-local interactions in protein secondary
structure prediction from the amino acid sequence.
Prot. Eng., 9:133-142, 1996.
J. D. Thompson, D. G. Higgins, and T. J. Gibson.
CLUSTAL W: improving the sensitivity of progressive multiple
sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice.
Nuc. Ac. Res., 22:4673-4680, 1994.
J. U. Bowie, R. Luthy, and D. Eisenberg.
A method to identify protein sequences that fold into a known
three-dimensional structure.
Science, 253:164-170, 1991.
A. Wlodawer, M. Miller, and M. Jaskolski.
Crystal structure of a retroviral protease proves relationship to
aspartic protease family.
Nature, 337:576, 1989.
K. Petratos, Z. Dauter, and K. S. Wilson.
Refinement of the structure of pseudoazurin from Alcaligenes
faecalis S-6 at 1.55 Angstroms.
Acta Cryst., 44:628, 1988.
N. K. Vyas, M. N. Vyas, and F. A. Quiocho.
Sugar and signal transducer binding sites of the Escherichia coli
galactose chemoreceptor protein.
Science, 242:1290, 1988.
E. Weber, E. Papamokos, W. Bode, R. Huber, I. Kato, and M. Laskowski.
Ovomucoid, a Kazal-type inhibitor, and model building studies of
complexes with serine proteases.
J. Mol. Biol., 158:515, 1982.
T. O. Fischmann and R. J. Poljak.
Crystallographic refinement of the three-dimensional structure of
FAB D1.2 Lysozyme complex at 2.5 Angstroms.
J. Biol. Chem., 266:12915, 1991.
Table 1:
Pairs in the RS126 set with an SD score greater than 5. Alignments were generated with the AMPS package [6] using a BLOSUM62 matrix and a gap penalty of 10, with 100 randomisations. Fold definitions come from the
current release (1.37) of the SCOP database [47].
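The SD score in the Table 1 caption can be read as the number of standard deviations by which a real alignment score exceeds the scores obtained after shuffling the two sequences. The sketch below illustrates that idea only and is not the AMPS implementation: the function names, the identity scoring (standing in for BLOSUM62), the linear gap cost, and the use of the population standard deviation are all assumptions made for the example.

```python
import random
import statistics


def global_alignment_score(a, b, match=1.0, mismatch=0.0, gap=10.0):
    """Toy Needleman-Wunsch score with identity scoring and a linear gap cost.

    Assumption: identity scoring stands in for a BLOSUM62 lookup.
    """
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-gap * i]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] - gap, curr[j - 1] - gap))
        prev = curr
    return prev[-1]


def sd_score(real_score, shuffled_scores):
    """(real - mean of shuffled scores) / standard deviation of shuffled scores."""
    mean = statistics.mean(shuffled_scores)
    spread = statistics.pstdev(shuffled_scores)  # assumption: population SD
    if spread == 0:
        raise ValueError("shuffled scores have zero spread")
    return (real_score - mean) / spread


def alignment_sd_score(a, b, n_shuffles=100, seed=0):
    """SD score of a pair, using n_shuffles randomisations of both sequences."""
    rng = random.Random(seed)
    real = global_alignment_score(a, b)
    shuffled = []
    for _ in range(n_shuffles):
        sa, sb = list(a), list(b)
        rng.shuffle(sa)
        rng.shuffle(sb)
        shuffled.append(global_alignment_score("".join(sa), "".join(sb)))
    return sd_score(real, shuffled)
```

Under this reading, a pair would be listed in Table 1 when `alignment_sd_score` returns a value above 5; the default `n_shuffles=100` mirrors the 100 randomisations quoted in the caption.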
Table 7:
Family size for the automatically generated alignments for the RS126 protein set, at two levels of
BLAST [59] p-value cutoff.

                                          p-value cutoff 10^-10   p-value cutoff 10^-2
Total number of residues                  1716356                 2013632
Total number of sequences                 7013                    8974
Average number of sequences per family    55.6                    71.2
Table 8:
Comparison of Q3 accuracy when the BLAST [59] p-value cutoff is relaxed from 10^-10 to 10^-2 on the RS126 set. (The alignments used for these predictions were not filtered by percentage identity.)
Table 13:
Results for single-sequence prediction methods from a full jack-knife test.
The column 'Author' gives the authors' own jack-knife value for the method, obtained with their dataset and
definition reduction method.
All results are calculated using reduction method A, with G and B states additionally converted to coil.
For PHD [64] the authors quote 71.6% as their cross-validated accuracy.
However, G and B states were considered in the accuracy calculation for PHD [64].
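As an illustration of how the reduction step affects a Q3 figure, the following sketch reduces an observed eight-state assignment to three states and then computes the percentage of correctly predicted residues. It assumes one reading of the Table 13 caption, namely that only H is kept as helix and only E as strand, with every other state (including G and B) treated as coil; the function names are hypothetical, and reduction method A itself is defined elsewhere in the paper.

```python
def reduce_to_three_states(assignment):
    """Map an eight-state string to H/E/C.

    Assumption: H stays helix, E stays strand, and every other
    state (G, B, T, S, I, '-') becomes coil (C).
    """
    return "".join(s if s in ("H", "E") else "C" for s in assignment)


def q3(predicted, observed_assignment):
    """Percentage of residues whose predicted three-state label matches."""
    if len(predicted) != len(observed_assignment):
        raise ValueError("prediction and observation differ in length")
    if not predicted:
        raise ValueError("empty sequence")
    observed = reduce_to_three_states(observed_assignment)
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

Because G and B residues count as coil here, a method that predicts them as helix or strand is penalised, which is one reason an accuracy computed this way can differ from a figure in which G and B were counted.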