The positions of residues that are invariant within the eukaryotic alignment, and of those invariant between the eukaryotic sequences and the bacteriophage and E. coli diadenosine tetraphosphatase sequences, are listed in Table 2.
The most accurate secondary structure predictions from an aligned family require sequences that can be aligned with little ambiguity, yet are sufficiently variable to highlight the positions most important to the common fold of the family. Unfortunately, whilst the eukaryotic phosphatase sequences analysed in this study are sufficiently similar to align well, in some regions the similarity is rather too great to improve dramatically on a conventional single-sequence secondary structure prediction. Bearing this limitation in mind, the summary secondary structure prediction is illustrated at the base of Figure 1.
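As a rough illustration of the trade-off between alignability and variability, the sketch below scores each column of an alignment by the fraction of sequences sharing its most common residue. It is not the method used in this study: the function names and the toy sequences are hypothetical, and the alignment is assumed to be supplied as equal-length strings with '-' marking gaps.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common symbol in each column.

    `alignment` is a list of equal-length aligned strings (hypothetical
    input format). A gap ('-') is counted as a symbol in its own right.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        # Count of the single most frequent symbol in this column.
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def invariant_fraction(alignment):
    """Fraction of columns in which every sequence has the same symbol."""
    scores = column_conservation(alignment)
    if not scores:
        return 0.0
    return sum(1 for s in scores if s == 1.0) / len(scores)


# Toy, made-up alignment purely for illustration.
toy = ["GDVHG", "GDIHG", "GDLHG", "GDVHA"]
print(column_conservation(toy))
print(invariant_fraction(toy))
```

A region in which nearly every column scores 1.0 aligns unambiguously but adds little information beyond what a single sequence provides, which is the limitation noted above.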
All secondary structure predictions are subject to error; however, given an accurate alignment, it is possible to assign the secondary structure of some regions with greater confidence than others. As a rough guide, the order of confidence in prediction is: loop (where insertions/deletions occur) > loop (conserved Gly/Pro/hydrophilic residues) > surface helix (with clear hydrophobic patterns) > surface strand (with clear hydrophobic patterns) > buried strand (short run of conserved hydrophobic residues). Accordingly, we describe here the arguments supporting the prediction of loop, helix and strand, and the reasons for ambiguity where this is shown in Figure 1.
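The confidence ordering given above can be written down as a simple ranked list, which may be convenient when annotating predicted regions. This is a minimal sketch only: the category labels, the function names and the example regions are hypothetical and do not correspond to actual regions in Figure 1.

```python
# Categories listed from most to least confident, following the rough
# ordering given in the text. The label strings are hypothetical.
CONFIDENCE_ORDER = [
    "loop_indel",                  # loop where insertions/deletions occur
    "loop_conserved_gly_pro_hyd",  # loop with conserved Gly/Pro/hydrophilic
    "surface_helix",               # clear hydrophobic pattern
    "surface_strand",              # clear hydrophobic pattern
    "buried_strand",               # short run of conserved hydrophobics
]


def confidence_rank(category):
    """Return 0 for the most confident category, larger for less confident."""
    return CONFIDENCE_ORDER.index(category)


def sort_by_confidence(regions):
    """Sort (name, category) pairs from most to least confident.

    Python's sort is stable, so regions in the same category keep
    their input order.
    """
    return sorted(regions, key=lambda region: confidence_rank(region[1]))


# Made-up regions purely for illustration.
example = [
    ("region_A", "buried_strand"),
    ("region_B", "loop_indel"),
    ("region_C", "surface_helix"),
]
for name, category in sort_by_confidence(example):
    print(name, category, confidence_rank(category))
```

The ranks are ordinal only; the text gives a rough ordering, not numerical confidence values.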