Multiple alignment query format

A multiple alignment query should be supplied in AMPS block file format.


The minimum requirements for a block file of N aligned sequences are:
1.  N '>comment' lines
2.  '* iteration int'
3.  N or more vertically aligned sequences
4.  '*'

  1. The comment lines define the sequence identifiers. The number of '>' characters preceding the first '* iteration int' line defines the number of sequences present in the sequence lines.

  2. This line specifies the beginning of the alignment to be read. The '*' character specifies the column in which the alignment begins. The integer in 'iteration int' identifies the particular alignment within this block file.

    The format allows several alternative alignments to follow each other, provided they are identified by different iteration numbers (e.g. 1, 2, 3). Currently, SCANPS only reads the first alignment. See the AMPS documentation for further details of alternative multiple alignments.

  3. The sequence lines are followed by a closing '*' character, which ends the alignment. It should be in the same column as the '*' character that started the alignment.
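
The reading rules above can be sketched in a few lines of Python. This is only an illustration, not the reader used by SCANPS or AMPS: the function name read_block_file is hypothetical, and the sketch assumes that '>' lines start in the first column, that the identifier is the first word after the '>', that a blank in a sequence column means a gap (reported here as '-'), and that short lines are padded with blanks.

```python
def read_block_file(text):
    """Return [(identifier, aligned_sequence), ...] for the first alignment.

    Hypothetical helper illustrating the block file layout; gaps (blanks)
    are reported as '-'.
    """
    lines = [line.expandtabs() for line in text.splitlines()]
    ids = []
    start = None
    star_col = None
    for i, line in enumerate(lines):
        if line.startswith(">"):
            # Assumption: the identifier is the first word after '>'.
            fields = line[1:].split()
            ids.append(fields[0] if fields else "")
        elif "*" in line:
            # The '* iteration int' line: the '*' marks the first sequence column.
            star_col = line.index("*")
            start = i + 1
            break
    if start is None:
        raise ValueError("no '* iteration' line found")

    n = len(ids)  # one sequence per '>' line seen before the '*' line
    columns = [[] for _ in range(n)]
    for line in lines[start:]:
        if len(line) > star_col and line[star_col] == "*":
            break  # closing '*' in the same column ends the alignment
        # Each text line is one alignment position; pad short lines with blanks.
        row = line[star_col:star_col + n].ljust(n)
        for column, ch in zip(columns, row):
            column.append("-" if ch == " " else ch)
    else:
        raise ValueError("alignment is not terminated by '*'")
    return list(zip(ids, ("".join(c) for c in columns)))
```

Because the sequences run vertically, each text line between the two '*' lines contributes one residue (or gap) to every sequence. Only the first alignment is returned, which mirrors the current behaviour of SCANPS described above.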

Simple example:


This is a block file containing an alignment of three sequences.
The comments that I am writing here may appear in the block file, but are
ignored when the file is read.  The only proviso is that no
'greater than' or 'star' characters may be present.

>first  this is sequence A
>second this is sequence B
>third  This is sequence C
* iteration 1
A  
A P
AVG
LLG
LCR
G
 PG
WWW
S  
*
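
Going the other way, a query in this layout could be generated from sequences that are already aligned. The sketch below is an assumption-laden illustration rather than part of SCANPS or AMPS: the function name write_block_file is hypothetical, gaps in the input are assumed to be written as '-', and the '*' lines are placed in the first column.

```python
def write_block_file(records, iteration=1):
    """Format [(identifier, aligned_sequence), ...] as block file text.

    Hypothetical helper: input gaps are '-', written out as blanks.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")

    out = [">" + name for name, _ in records]   # one '>' line per sequence
    out.append("* iteration %d" % iteration)    # '*' in the first column
    # Transpose: each output line holds one alignment position,
    # with one character per sequence.
    for row in zip(*(seq for _, seq in records)):
        out.append("".join(" " if ch == "-" else ch for ch in row))
    out.append("*")                             # closing '*', same column
    return "\n".join(out) + "\n"
```

Called with the three sequences of the example above, written with '-' for gaps (AAALLG-WS, --VLC-PW-, -PGGR-GW-), this produces the same identifier and sequence lines, with every sequence line padded to three characters.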


gjb@bioch.ox.ac.uk