
Extracting Sequences from a PIR Database Using PROGRAM SELECT

If you have access to a database management program (e.g. NBRF PSQ), you may use it to extract the sequences of interest from the database. Alternatively, PROGRAM SELECT extracts sequences from any PIR format sequence file. For example, given the file 'bash_ge_4_scan1.top20id' obtained above and the PIR database file 'protein.seq', a run of SELECT looks like this:



Program S E L E C T

Extracts sequences from PIR database

Author: G. J. Barton (1990)
Maximum Allowed Sequence Length: 8000
Maximum Allowed Number of Sequences: System Dependent

Enter name of file containing SCORE ID pairs: bash_ge_4_scan1.top20id

Opening File: bash_ge_4_scan1.top20id

Enter Database Filename: protein.seq

Opening File: protein.seq

Just Extract Identifiers/titles (no sequences) ?[Y/N]: N

Enter Output Filename: bash_ge_4_scan1.top20seq

Opening File: bash_ge_4_scan1.top20seq

Searching for: 20 Sequences
Found: HACHPE     1
Found: HADKP     2
Found: HZHU     3
Found: HZCZ     4
Found: HZPG     5
Found: HGHUA     6
Found: HGMQP     7
Found: HGMQR     8
Found: HGMQJ     9
Found: HGBAY    10
Found: HGMKS    11
Found: HBRB3    12
Found: HBMSH0    13
Found: HEGT1    14
Found: HEMSY2    15
Found: HBTG    16
Found: HBOR    17
Found: HBPY    18
Found: HBTTP    19
Found: HBFG3T    20
Extracted: 20 Sequences

A message is displayed on the screen as each sequence is found. The extracted sequences are sorted into descending score order and written to the chosen output file (bash_ge_4_scan1.top20seq).
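The selection step described above can be sketched in a few lines of Python. This is an illustration only, not the SELECT source code. It rests on two assumptions that are not stated in this section: that the SCORE ID file holds one whitespace-separated "score identifier" pair per line, and that each PIR entry is a '>P1;CODE' line, then a title line, then sequence lines ending in '*'. The function names (read_score_ids, read_pir, select, format_pir) are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def read_score_ids(lines: Iterable[str]) -> List[Tuple[float, str]]:
    """Parse 'score identifier' pairs, one per line (assumed layout)."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((float(fields[0]), fields[1]))
    return pairs


def read_pir(lines: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map identifier -> (title, sequence) for each PIR entry (assumed layout)."""
    entries = {}
    ident = None
    title = None
    seq: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">") and ";" in line:
            # Header line such as '>P1;CODE' starts a new entry.
            ident = line.split(";", 1)[1].strip()
            title = None
            seq = []
        elif ident is not None and title is None:
            # The line after the header is the title.
            title = line
        elif ident is not None:
            text = line.replace(" ", "")
            if text.endswith("*"):
                # '*' terminates the sequence; store the finished entry.
                seq.append(text[:-1])
                entries[ident] = (title, "".join(seq))
                ident = None
            else:
                seq.append(text)
    return entries


def select(pairs, entries):
    """Keep pairs whose identifier is in the database, highest score first."""
    found = [(score, ident) for score, ident in pairs if ident in entries]
    found.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, ident) + entries[ident] for score, ident in found]


def format_pir(records) -> str:
    """Write the selected records back out in the same assumed PIR layout."""
    return "".join(
        f">P1;{ident}\n{title}\n{seq}*\n" for _score, ident, title, seq in records
    )
```

With real files, the lines would come from open('bash_ge_4_scan1.top20id') and open('protein.seq'), and the string returned by format_pir would be written to the output file. The sketch reads the whole database into memory, which is convenient for small files but may not suit a full-size database.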

gjb@bioch.ox.ac.uk