The multiple alignment output is a ``pseudo'' multiple alignment that has the same length as the query sequence. It shows the query sequence as the first row of the alignment, with the database sequences below. Insertions in database sequences are simply discarded in this output, so it should NOT be used for full multiple alignment analysis (but see below for a way to do this).
The alignment is output in blocks of 50 residues per line by default, though this can be changed with the MULTIPLE_OUTPUT_LENGTH command line option.
# Multiple alignment Start for iteration: 0
# Format: Simple
# Residues_per_line: 50
#
# ID Evalue Start End Len
# 1
SCANPS_TEST 0         1   74  74:  YPGFNPSVDAEAIRKAIRGIGTDEKTLINILTERSNAQR
ANX3_RAT    1.98e-20  16  89  324: YPGFNPSVDAEAIRKAIKGIGTDEKTLINILTERSNAQR
ANX3_MOUSE  9.3e-20   15  88  323: YPGFSPSVDAEAIRKAIRGLGTDEKTLINILTERSNAQR
ANX3_HUMAN  7.02e-17  15  88  323: YPDFSPSVDAEAIQKAIRGIGTDEKMLISILTERSNAQR
ANX5_MOUSE  1.7e-08   10  83  319: FPGFDGRADAEVLRKAMKGLGTDEDSILNLLTSRSNAQR
ANX5_RAT    6.92e-08  9   82  318: FSGFDGRADAEVLRKAMKGLGTDEDSILNLLTARSNAQR
ANX5_HUMAN  6.94e-08  11  84  319: FPGFDERADAETLRKAMKGLGTDEESILTLLTSRSNAQR
ANX5_BOVIN  6.97e-08  11  84  320: FPGFDERADAETLRKAMKGLGTDEESILTLLTSRSNAQR
ANXB_HUMAN  9.21e-08  198 270 505: -PGFDPLRDAEVLRKAMKGFGTDEQAIIDCLGSRSNKQR
ANX8_MOUSE  1.67e-07  21  91  327: ---FNPDPDAETLYKAMKGIGTNEQAIIDVLTKRSNVQR
ANX8_HUMAN  1.67e-07  21  91  327: ---FNPDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR
ANX6_BOVIN  2.08e-07  304 378 618: -PGFNPDADAKALRKAMKGLGTDEDTIIDIITHRSNAQR
ANX6_BOVIN  5.57e-07  38  107 618: -----PAADAKEIKDAISGIGTDEKCLIEILASRTNEQH
ANX5_CHICK  2.49e-07  14  85  321: -P-FDARADAEALRKAMKGMGTDEETILKILTSRNNAQR
ANX1_CHICK  2.53e-07  35  107 130: -PNFDPSADVSALDKAITVKGVDEATIIDILTKRTNAQR
ANX6_HUMAN  3.51e-07  16  89  672: FPGFDPNQDAEALYTAMKGFGSDKEAILDIITSRSNRQR
ANX6_HUMAN  8.19e-07  92  161 672: -----PACDAKEIKDAISGIGTDEKCLIEILASRTNEQH
lines deleted
ANX7_DICDI  0.000126  166 231 462: QIKREFSAKYSKDLIQDIKSETSGNFEKCLVALL
ANX2_XENLA  0.000152  33  103 339: DIAFAFHRRTKKDLPSALKGALSGNLETVMLGLI
ANX2_XENLA  0.00126   107 174 339: LDIQNYRELFKTELEKDIMSDTSGDFRKLMVAL-
ANX1_RODSP  0.000155  38  111 345: HLKAVYQETGE-PLDETLKKALTGHIQELLLAMI
ANXD_HUMAN  0.00016   10  83  315: QIKQKYKATYGKELEEVLKSELSGNFEKTALALL
ANXD_HUMAN  0.000184  86  155 315: IAIKEYQRLFDRSLESDVKGDTSGNLKKILVSLL
ANX9_HUMAN  0.0002    31  104 338: LISRNFQERTQQDLMKSLQAALSGNLERIVMALL
ANX4_FRAAN  0.000243  8   76  314: EIRAAYEQLYQEDLLKPLESELSGDFEKAV----
AN11_COLLI  0.000309  37  107 341: RIKAAYHKAKGKSLEEAMKRVLKSHLEDVVVALL
ANX9_MOUSE  0.000405  35  104 338: LISRAFQERTKQDLLKSLQAALSGNLEKIVVALL
#
# Multiple alignment End for iteration: 0
#
The first two columns (ID and Evalue) should be self-explanatory. Start and End refer to the first and last residue positions within the database sequence, while ``Len'' is the database sequence length, so that you can see if there are any pathologically short sequences in the alignment. This alignment is used in iteration to build a profile for subsequent searches. Which sequences are included in the alignment is controlled by the probcut2 command; by default probcut2 is set to 0.1.
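As an illustration of the column layout described above, the following is a minimal Python sketch that parses a single data row of the ``Simple'' format. It is not part of SCANPS: the names AlignmentRow, parse_row and coverage are hypothetical, and the sketch assumes each data row has exactly five whitespace-separated fields (ID, Evalue, Start, End, Len) followed by a colon and the aligned residues, as in the example output. Comment lines beginning with ``#'' and any other non-data lines are skipped.

```python
from typing import NamedTuple, Optional


class AlignmentRow(NamedTuple):
    """One data row of the pseudo multiple alignment (hypothetical helper type)."""
    seq_id: str
    evalue: float
    start: int      # first residue position within the database sequence
    end: int        # last residue position within the database sequence
    length: int     # full length of the database sequence ("Len")
    residues: str   # aligned residues for this block, '-' marks a gap


def parse_row(line: str) -> Optional[AlignmentRow]:
    """Parse one row; return None for comments, blank lines and anything else.

    Assumes the layout 'ID Evalue Start End Len: RESIDUES' seen in the
    example output. This is an assumption, not a format specification.
    """
    line = line.strip()
    if not line or line.startswith("#") or ":" not in line:
        return None
    header, residues = line.split(":", 1)
    fields = header.split()
    if len(fields) != 5:
        return None
    seq_id, evalue, start, end, length = fields
    try:
        return AlignmentRow(seq_id, float(evalue), int(start), int(end),
                            int(length), residues.strip())
    except ValueError:
        return None


def coverage(row: AlignmentRow) -> float:
    """Fraction of the database sequence spanned by Start..End (inclusive)."""
    return (row.end - row.start + 1) / row.length


row = parse_row("ANX1_CHICK 2.53e-07 35 107 130: "
                "-PNFDPSADVSALDKAITVKGVDEATIIDILTKRTNAQR")
```

Here row.length is 130, which stands out against the other annexin entries of 300 or more residues; this is the kind of short sequence the ``Len'' column is meant to expose. Note that the same ID can appear on more than one row (for example ANX6_BOVIN), so a key such as (ID, Start, End) is a safer way to tell hits apart than the ID alone.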
You can obtain a FASTA formatted sequence file that contains the complete sequence fragments found in the database search by adding
-pff 1 -frag_file_out frags.fa
to the command line. This will create a file called ``frags.fa'' that contains the fragments between Start and End in each line of the multiple alignment, but without any internal deletions. You can feed this file to clustal or another multiple alignment program for further analysis.
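Before passing the fragment file to an alignment program, it can be useful to check what it contains. The sketch below is a generic FASTA reader in plain Python, not a SCANPS utility. The function name read_fasta and the two demo records are hypothetical, and the exact header format SCANPS writes to ``frags.fa'' is not assumed here: everything after the ``>'' is returned as the header unchanged.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream.

    Sequence lines belonging to one record are joined into a single string.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input; real headers in frags.fa may look different.
demo = io.StringIO(">frag1\nACDE\nFGH\n>frag2\nKLMN\n")
records = list(read_fasta(demo))
```

To inspect a real file, open it and iterate in the same way, for example printing each header with len(sequence) to confirm that fragment lengths are consistent with the Start and End columns of the alignment output.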