
Single order alignment

This is the method described in Barton and Sternberg (1987b). The sequences are aligned progressively in a single order. If the sequences are all clearly similar to each other (say, > 50% identity), the actual order makes very little difference to the final alignment. In general, however, it is better to establish the order by first performing all pairwise comparisons for the sequences as described in the previous section, then using the program ORDER to define an order (described below) before proceeding to multiple alignment.

The ORDER program generates an order_file. This specifies the order in which the sequences present in the seq_file are to be aligned. A typical command file for multiple alignment is given in globin_mult.com and shown below.


Commands                                      Explanation number
--------                                      ------------------

output_file=globin_mult.out                   1.
mode=multiple                                 2.
matrix=file=ampsdir:md.mat                    3.
gap_penalty=8.0                               4.
constant = 8                                  5.
consplot=mz                                   6.
print_vertical=                               7.
seq_file=globin.seq                           8.
order_file=globin.ord                         9.

  1. Command to MULTALIGN: set the file for output of results to 'globin_mult.out'. This command must ALWAYS be the first. Note: the output file should NOT be set to the log file name.

  2. Specify that multiple mode is to be used. If this command is not included, the program defaults to mode=multiple.

  3. Define the matrix file to be used in the comparisons. This file is the Dayhoff mutation data matrix or similar file containing pairscore values for each amino acid pair.

  4. Define a gap penalty of 8.0.

  5. Define a constant of 8 to be added to the matrix defined in 3.

  6. Optional request to perform a conservation analysis on the resulting alignment using the algorithm of Zvelebil et al. (1987). This only works if the print_vertical command is also present.

  7. Specify vertical format output.

  8. Define the file containing the sequences to be aligned as 'globin.seq'.

  9. Define the order file obtained from running the program ORDER on the results of a pairwise sequence comparison run. If this command is absent, the program aligns the sequences in the order in which they appear in the seq_file. Note: the order_file command must always appear after the seq_file command.
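
The ordering rules stated in the notes above (output_file first, order_file after seq_file, consplot needing print_vertical) can be checked mechanically before a run. The following Python sketch is not part of AMPS: the function names parse_commands and check_command_file, and the simple one-command-per-line 'keyword=value' parsing, are assumptions made for illustration. The real MULTALIGN parser may accept or reject input differently.

```python
def parse_commands(text):
    """Split a command file into (keyword, value) pairs, in order.

    Assumes one 'keyword=value' command per line; blank lines are skipped.
    Only the first '=' separates keyword from value, so a line such as
    'matrix=file=ampsdir:md.mat' keeps 'file=ampsdir:md.mat' as its value.
    """
    commands = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        commands.append((key.strip().lower(), value.strip()))
    return commands


def check_command_file(text):
    """Return a list of problems found; an empty list means none were found."""
    keys = [key for key, _ in parse_commands(text)]
    problems = []
    # Rule from note 1: output_file must be the first command.
    if not keys or keys[0] != "output_file":
        problems.append("output_file must be the first command")
    # Rule from note 9: order_file must come after seq_file.
    if "order_file" in keys:
        if "seq_file" not in keys or keys.index("order_file") < keys.index("seq_file"):
            problems.append("order_file must appear after seq_file")
    # Rule from note 6: consplot needs print_vertical.
    if "consplot" in keys and "print_vertical" not in keys:
        problems.append("consplot only works if print_vertical is present")
    return problems


# The command file shown above, reproduced without the explanation column.
GLOBIN_MULT = """\
output_file=globin_mult.out
mode=multiple
matrix=file=ampsdir:md.mat
gap_penalty=8.0
constant = 8
consplot=mz
print_vertical=
seq_file=globin.seq
order_file=globin.ord
"""

if __name__ == "__main__":
    print(check_command_file(GLOBIN_MULT))
```

The check reports no problems for the command file shown above. Moving the order_file line above seq_file, or deleting print_vertical, would each add one message to the returned list.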

Optionally, a process of iteration can be performed. Once all the sequences have been added to the alignment, the first sequence is realigned with the alignment of sequences 2-N, then the second sequence is realigned with the alignment of sequences 1 and 3-N, and so on. To request iteration, include the command 'iterations=int', where int is an integer greater than 0. In general, a value of 3 can refine the alignment slightly; iteration is of greatest use when there are relatively few sequences to be aligned (say < 10).
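
The realignment pattern described above can be written out as a schedule. The Python sketch below only enumerates which sequence is realigned against which others; it performs no alignment and is not taken from the AMPS source. The function name iteration_schedule is hypothetical, and treating 'iterations=int' as the number of complete passes over all N sequences is an assumption about how the program interprets the value.

```python
def iteration_schedule(n_sequences, iterations):
    """Yield (pass_number, sequence, others) for each realignment step.

    Sequences are numbered 1..N as in the text. Within each pass, every
    sequence in turn is realigned against the alignment of all the others.
    Assumption: 'iterations' counts complete passes over the N sequences.
    """
    if iterations < 1:
        raise ValueError("iterations must be an integer greater than 0")
    for pass_number in range(1, iterations + 1):
        for seq in range(1, n_sequences + 1):
            others = [s for s in range(1, n_sequences + 1) if s != seq]
            yield pass_number, seq, others


if __name__ == "__main__":
    for pass_number, seq, others in iteration_schedule(4, 1):
        print(f"pass {pass_number}: realign sequence {seq} against {others}")
```

Under that assumption, N sequences and k iterations give N x k realignment steps, which is one way to see why the extra cost grows with the number of sequences.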

The result of a run using this command file is illustrated in the file globin_mult.out.





gjb@bioch.ox.ac.uk