
tree_file

This defines the order in which a tree-based multiple alignment is performed. The format is FIXED and very simple. At each stage, a tree-based alignment performs a pairwise alignment of two clusters, each of which is either a set of sequences that have already been aligned or a single sequence. The tree_file defines which sequences belong to each cluster at each stage of the alignment.

For example, we may have 5 sequences: 1, 2, 3, 4 and 5. At each stage we align two clusters of sequences, A and B. The tree file might look like this:



     The tree				(my comments - not in the file)


     1					number of seqs in cluster A
     3					seq 3
     1					number of seqs in cluster B
     4					seq 4  (A and B are now aligned)
     1					Number of seqs in next cluster A
     1					seq 1
     1   				Number of seqs in next cluster B
     5					seq 5 (new A and B are now aligned)
     2					Number of seqs in next cluster A
     3    4				seqs 3 and 4
     2                                  Number of seqs in next cluster B
     1    5                             seqs 1 and 5 (now aligned to 3 and 4)
     1                                  Number of seqs in next cluster A
     2                                  seq 2
     4                                  Number of seqs in next cluster B
     3    4    1    5                   Finally seq 2 is aligned to 3,4,1,5.

This tree can be generated by the program ORDER, or might be supplied by a more sophisticated clustering program.
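
As an illustration of how the example tree above is read, here is a minimal Python sketch. It is not part of MULTALIGN. It assumes the file can be read as a plain stream of whitespace-separated integers (a count, then that many sequence numbers, for cluster A and then cluster B at each stage). The names parse_tree and EXAMPLE are hypothetical and used only for this illustration.

```python
from typing import List, Tuple

Cluster = List[int]

# The example tree from the text, without the explanatory comments.
EXAMPLE = """\
     1
     3
     1
     4
     1
     1
     1
     5
     2
     3    4
     2
     1    5
     1
     2
     4
     3    4    1    5
"""


def parse_tree(text: str) -> List[Tuple[Cluster, Cluster]]:
    """Return one (cluster A, cluster B) pair per alignment stage.

    Assumption: integers are separated by whitespace, so the fixed-width
    layout does not need to be decoded column by column.
    """
    tokens = [int(tok) for tok in text.split()]
    pos = 0

    def read_cluster() -> Cluster:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("tree ends before cluster B of the last stage")
        count = tokens[pos]
        members = tokens[pos + 1:pos + 1 + count]
        if len(members) != count:
            raise ValueError("tree ends in the middle of a cluster")
        pos += 1 + count
        return members

    stages = []
    while pos < len(tokens):
        cluster_a = read_cluster()
        cluster_b = read_cluster()
        stages.append((cluster_a, cluster_b))
    return stages


if __name__ == "__main__":
    for cluster_a, cluster_b in parse_tree(EXAMPLE):
        print("align", cluster_a, "with", cluster_b)
```

Read this way, the example yields four stages: 3 with 4, 1 with 5, (3,4) with (1,5), and finally 2 with (3,4,1,5).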


Format(1x,20i5)
Example: globin_pairs.tree
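
For anyone producing a tree_file from another clustering program, the following Python sketch shows one way to write records that match (1x,20i5): one leading blank, then up to 20 integers, each right-justified in a field of width 5. It is a sketch under assumptions, not MULTALIGN code. In particular it assumes that each count and each list of sequence numbers is written as its own record, as in the example above, and that a list of more than 20 numbers continues on a further record in the same layout. The function names format_record and write_tree are hypothetical.

```python
from typing import List, Sequence, Tuple


def format_record(values: Sequence[int], per_line: int = 20,
                  width: int = 5) -> List[str]:
    """Lay out integers as (1x,20i5) records: one blank, then i5 fields."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append(" " + "".join(f"{value:{width}d}" for value in chunk))
    return lines


def write_tree(stages: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> str:
    """Return tree_file text for a list of (cluster A, cluster B) stages."""
    out: List[str] = []
    for cluster_a, cluster_b in stages:
        for cluster in (cluster_a, cluster_b):
            out.extend(format_record([len(cluster)]))  # number of seqs
            out.extend(format_record(list(cluster)))   # the seqs themselves
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # The last two stages of the five-sequence example.
    print(write_tree([([3, 4], [1, 5]), ([2], [3, 4, 1, 5])]), end="")
```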

Note. The sequence numbers in the tree_file refer to the sequences in the order in which MULTALIGN stores them internally. Normally an order_file would be used in conjunction with the tree_file so that similar sequences are clustered together on output. See the documentation on program ORDER for details of producing compatible tree and order files.


gjb@bioch.ox.ac.uk