Figure 1 Physico-chemical Properties of the Amino Acids
(a) The 20 common amino acids are shown in terms of ten physico-chemical properties [Zvelebil et al., 1987][Taylor, 1986]. Grey filled areas define sets of properties possessed by none of the common amino acids. The hydrophobic, polar and small sets dominate the figure. The remaining sets define subsidiary groups. The dotted line joining L to R shows the minimum number of set boundaries (five) that must be crossed in order to change an L to an R in this ten-property diagram (see text).
(b) An amino acid property index derived from the Venn diagram in Figure 1a (after [Zvelebil et al., 1987], treating Cys as ). Columns represent amino acids and rows represent properties. Filled circles show where an amino acid possesses a property. represents a gap, which, in this index, is regarded as having all properties.
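The boundary count described in panel (a) can be read as the number of properties that differ between two residues. The following is a minimal sketch of that idea, not the authors' code. The property sets given for L and R are an assumption based on a common reading of the Taylor/Zvelebil classification and should be checked against Figure 1b; the names `PROPERTIES_OF` and `boundaries_crossed` are hypothetical.

```python
# Assumed property memberships for two residues (check against Figure 1b).
PROPERTIES_OF = {
    "L": {"hydrophobic", "aliphatic"},
    "R": {"polar", "charged", "positive"},
}


def boundaries_crossed(a, b, table=PROPERTIES_OF):
    """Count the properties held by exactly one of the two residues.

    Each such property corresponds to one set boundary that must be
    crossed in the Venn diagram to move from residue a to residue b.
    """
    return len(table[a] ^ table[b])  # size of the symmetric difference


print(boundaries_crossed("L", "R"))
```

Under these assumed memberships the count for L and R is five, in agreement with the dotted line in panel (a).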
Figure 2 Calculation of Conservation Numbers
The Venn diagram showing the relationship between the amino acids on the basis of charge (a) is converted to a property index (b), which is used to analyse the conservation of charged residues in the sequence alignment (c). The amino acids present at each sequence position are recorded (d) and tested for each of the properties in the index (e). Columns of filled (presence of a property) and empty (lack of a property) circles record the properties of each amino acid in the same vertical order as in the property index. The presence of properties is summed (f): filled circles show positive conservation of a property in the group of amino acids, shaded circles show where properties are present in some but not all of the amino acids, and empty circles show negatively conserved properties. A conservation score is obtained by summing either the number of positively and negatively conserved properties (g, method 1) or the number of positively conserved properties alone (h, method 2) (see text).
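The calculation described in this caption can be sketched as follows. This is an illustrative sketch only: the three-property index below is a toy assumption rather than the index of Figure 2b, the function name `conservation_numbers` is hypothetical, and gaps are not handled.

```python
# Toy property index (an assumption for illustration, not Figure 2b):
# each property maps to the set of residues that possess it.
PROPERTY_INDEX = {
    "charged": set("DEKRH"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def conservation_numbers(residues, index=PROPERTY_INDEX):
    """Return (method 1 score, method 2 score) for one alignment position.

    A property is positively conserved if every residue has it and
    negatively conserved if no residue has it. Method 1 counts both
    kinds; method 2 counts positively conserved properties only.
    """
    observed = set(residues)
    if not observed:
        raise ValueError("at least one residue is required")
    positive = 0
    negative = 0
    for members in index.values():
        if observed <= members:            # present in all residues
            positive += 1
        elif observed.isdisjoint(members):  # absent from all residues
            negative += 1
        # otherwise the property is mixed and contributes nothing
    return positive + negative, positive


print(conservation_numbers("DE"))  # residues seen at one position
print(conservation_numbers("DK"))
```

With this toy index, a position containing only D and E conserves "charged" and "negative" positively and "positive" negatively, so method 1 gives 3 and method 2 gives 2. A position containing D and K conserves only "charged", so both methods give 1.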
Figure 3 Hierarchical Conservation Analysis
A 10-residue fragment of a multiple sequence alignment of 26 sequences is shown to the right of the figure. The relationship between the sequences in the whole alignment is represented by the dendrogram to the left, which shows three sub-groups, A, B and C. Each position of the groups in the multiple sequence alignment has been analysed for residue conservation using the property index in Figure 1b. The conservation threshold was set to 8. Information about the conservation pattern is given at the foot of the alignment in numerical and graphical form. The representation of the alignment and the conservation patterns to the right of the figure were imported directly from the graphical output of the program AMAS.
Figure 4 Text Representation of Sequence Conservation
With reference to Figure 3, the text representation of the analysis gives a more detailed description of the conservation of physico-chemical properties at each alignment position. Each record identifies the sequence position to which it refers (rounded brackets), the sub-group(s) involved in the pattern being reported, the pair conservation number(s) of those groups where non-identities are reported (rounded brackets), the residues present in each group (square brackets), and the properties that are conserved by them and that differ between them. Differences in properties between sub-groups are reported; the percentage of residues in each sub-group that have a property is shown in square brackets.
Figure 5 Charge Conservation in 40 Annexin Repeats
(a) The pattern of conserved charge in 40 annexin repeats determined using the charge property index described in Figure 2. Only positive property conservation is considered, at a conservation threshold of 2; this means that a sub-group position must conserve both charge and polarity to be reported. Conserved positions alone are reported in order to highlight the pattern of charged residues; the residues at unconserved positions have been masked out. Two gaps, and residues constituting less than 10% of a sub-group position, have been screened from the conservation calculation. Identities and conserved positions are identified according to the shading protocol given in Figure 3. A charge difference is clearly seen in the histogram at position 31, reflecting the switch between a conserved E (negative) in repeat 2 and a conserved R (positive) in repeat 4.
(b) Text output accompanying the analysis in Figure 5a. The record format used is identical to that used in Figure 4.
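The screening step mentioned in panel (a) can be sketched as below. This is one possible reading, not the AMAS implementation: it assumes that residue frequencies are taken over the whole sub-group column (gaps included), that gaps are ignored only when their count does not exceed the allowance, and that `-` marks a gap. The name `screen_column` and its parameters are hypothetical.

```python
from collections import Counter


def screen_column(column, max_gaps=2, min_fraction=0.10, gap="-"):
    """Return the residues of one sub-group position kept for analysis.

    Residues rarer than min_fraction of the column are dropped, and
    gaps are dropped when there are no more than max_gaps of them.
    """
    total = len(column)
    n_gaps = column.count(gap)
    counts = Counter(c for c in column if c != gap)
    kept = {res for res, n in counts.items() if n / total >= min_fraction}
    if n_gaps > max_gaps:
        kept.add(gap)  # too many gaps to ignore under this reading
    return kept


# 18 E, one D (5% of the column) and one gap: only E survives screening.
print(screen_column("E" * 18 + "D" + "-"))
```

The surviving residue set is what would then be passed to the conservation calculation described in Figure 2.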
Figure 6 Conservation Analysis of 67 SH2 Domains
An alignment of 67 SH2 domains analysed using the general property index (Figure 1b). A key to the shading strategy is given in Figure 3 (see text). The mean pair conservation number for conserved sub-group pairs at each position is reported below the histogram if it is equal to or exceeds the threshold of 7 for the plot. One gap per sub-group was ignored.