
Generating Alignments of the Pattern With Interesting Sequences

We now have a pattern and a file containing the sequences that score well against it. To see the alignment of the pattern with each sequence, we must re-run the MULTALIGN program with the print_horizontal (or print_vertical) command. This produces very verbose output.

For example, the command file bash_ge_4_scan1_top20.com:



output_file=bash_ge_4_scan1_top20.alig
mode = scan			    
block_file/pattern=bash_ge_4.bloc,1 
matrix_file=md.mat		    
database=bash_ge_4_scan1.top20seq
print_horizontal=

produces the output shown in file bash_ge_4_scan1_top20.alig.

A far more compact form of output may be obtained by using the PRINT_HORIZONTAL/PATTERN=fname command, where fname is the name of the file to write. This produces a file in a special format for the program PATT.



output_file=bash_ge_4_scan1_top20.out
mode = scan			    
block_file/pattern=bash_ge_4.bloc,1 
matrix_file=md.mat		    
database=bash_ge_4_scan1.top20seq
print_horizontal/pattern=bash_ge_4_scan1_top20.patt
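
The two command files above differ only in the output file name and in the final print command. As a minimal sketch of that difference, the Python helper below builds either form from the same inputs. The function name build_scan_commands and its arguments are hypothetical and not part of the program; only the keywords and file names shown in the examples above are used.

```python
# Hypothetical helper (not part of the program): builds the text of a
# scan-mode command file using only the keywords shown in the examples.
def build_scan_commands(stem, block_file, pattern_no, matrix_file, database,
                        patt_file=None):
    """Return the command file text as a single string."""
    if patt_file is None:
        # Verbose form: full alignments are written to the .alig file.
        output_file = stem + ".alig"
        print_line = "print_horizontal="
    else:
        # Compact form: alignments are written to patt_file for PATT.
        output_file = stem + ".out"
        print_line = "print_horizontal/pattern=" + patt_file
    lines = [
        "output_file=" + output_file,
        "mode = scan",
        "block_file/pattern=%s,%d" % (block_file, pattern_no),
        "matrix_file=" + matrix_file,
        "database=" + database,
        print_line,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    stem = "bash_ge_4_scan1_top20"
    text = build_scan_commands(stem, "bash_ge_4.bloc", 1, "md.mat",
                               "bash_ge_4_scan1.top20seq",
                               patt_file=stem + ".patt")
    print(text, end="")
```

Called with patt_file set, as here, the helper reproduces the second command file; leaving patt_file out gives the verbose form of the first.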

We can now run the program PATT on the resulting file:


 ------------------------
 Program    P A T T E R N
 ------------------------
  
 Processes FSCAN print_horizontal/pattern output
  
 Author:  Geoff Barton
  
 Maximum Pattern Length:     2000
 Maximum Pattern Hits + Pattern to Display:     2000
 
 Enter Pattern file: bash_ge_4_scan1_top20.patt
 
 Enter Output file: bash_ge_4_scan1_top20.pattout
 
 Enter page width for horizontal output (>50, Def:132): 
 Output width:     132
 Reading Pattern Description
 ---Initializing
 ---Done
 Reading Pattern Alignment
  153.14 >HZPG       141    1    1 Hemoglobin zeta chain - Pig                 
  152.57 >HZCZ       142    2    1 Hemoglobin zeta-1 chain - Chimpanzee        
  152.57 >HZHU       141    3    1 Hemoglobin zeta chain - Human               
  151.43 >HACHPE     141    4    1 Hemoglobin pi' chain - Chicken              
  149.57 >HADKP      141    5    1 Hemoglobin pi' chain - Muscovy duck         
  149.00 >HEMSY2     146    6    1 Hemoglobin epsilon-y2 chain - Mouse         
  148.00 >HBRB3      147    7    1 Hemoglobin gamma (beta-3) chain - Rabbit    
  147.29 >HBPY       146    8    1 Hemoglobin beta chain - Pigeon              
  147.14 >HBFG3T     146    9    1 Hemoglobin beta chain - Bullfrog tadpole    
  147.00 >HBMSH0     147   10    1 Hemoglobin beta-h0 chain - Mouse            
  146.86 >HBTG       146   11    1 Hemoglobin beta chain - Australian echidna  
  146.86 >HBTTP      146   12    1 Hemoglobin beta chain - Western painted turt
  146.71 >HGMQJ      146   13    1 Hemoglobin gamma chain - Japanese macaque   
  146.71 >HGMQR      146   14    1 Hemoglobin gamma chain - Rhesus macaque     
  146.71 >HGBAY      146   15    1 Hemoglobin gamma chain - Yellow baboon      
  146.71 >HGMQP      146   16    1 Hemoglobin gamma chain - Pig-tailed macaque 
  146.71 >HGHUA      146   17    1 Hemoglobin gamma chains - Human and chimpanz
  146.71 >HGMKS      146   18    1 Hemoglobin gamma chain - Spider monkey      
  146.57 >HBOR       146   19    1 Hemoglobin beta chain - Duckbill platypus   
  146.57 >HEGT1      147   20    1 Hemoglobin epsilon-I chain - Goat           
 
 Sort the scores? [Y] 
 Formatting Alignments
 Adding Flexible Gap Details
 Writing Vertical Format Alignment
 Writing Horizontal Format Alignment

The program has only two options: the output width for the results, and whether to sort the displayed scores. Sorting is necessary if multiple patterns per sequence are calculated using the PATTERN_LEVEL=N option described below.
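
To illustrate what sorting the scores amounts to, the sketch below reads hit lines of the kind listed in the session above and orders them by score. It is not part of PATT. It assumes that the first column is the score and that the name after ">" is the sequence identifier; the three integer columns are carried through uninterpreted, and descending order is also an assumption. The names parse_hits and sort_hits are hypothetical.

```python
import re

# One hit line: score, ">identifier", three integers, then a description.
HIT_RE = re.compile(
    r"^\s*(-?\d+\.\d+)\s+>(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*?)\s*$"
)


def parse_hits(text):
    """Collect the hit lines found in a captured PATT session."""
    hits = []
    for line in text.splitlines():
        match = HIT_RE.match(line)
        if match is None:
            continue  # banner, prompt or progress line
        score, ident, c1, c2, c3, description = match.groups()
        hits.append({
            "score": float(score),
            "id": ident,
            # Meaning of these columns is not assumed here.
            "fields": (int(c1), int(c2), int(c3)),
            "description": description,
        })
    return hits


def sort_hits(hits):
    """Order hits by score, highest first (assumed order); ties keep
    their original relative order because sorted() is stable."""
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)


if __name__ == "__main__":
    sample = (
        " 146.57 >HBOR       146   19    1 Hemoglobin beta chain - Duckbill platypus\n"
        " 153.14 >HZPG       141    1    1 Hemoglobin zeta chain - Pig\n"
    )
    for hit in sort_hits(parse_hits(sample)):
        print(hit["score"], hit["id"], hit["description"])
```

With several patterns per sequence the hits need not already be in score order, which is the case where the sort option matters.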

The output file from the program PATT contains vertical and horizontal format multiple alignments of the pattern with all the sequences in the list. This is considerably more compact than the print_horizontal= format (file bash_ge_4_scan1_top20.alig). The vertical format output may be used to define a further pattern for database scanning.


gjb@bioch.ox.ac.uk