Extracting the sequences from the database





select
Program S E L E C T

Extracts sequences from PIR database

Author: G. J. Barton (1990)
Maximum Allowed Sequence Length: 8000
Maximum Allowed Number of Sequences: 2000

Enter name of file containing SCORE ID pairs: sh2.top15

Opening File: sh2.top15


Opening File: /data/pir/pir38.seq

Just Extract Identifiers/titles (no sequences) ?[Y/N]: 

Enter Output Filename: sh2.top15.seq

Opening File: sh2.top15.seq

Searching for: 15 Sequences
1 A34104
2 A43610
3 B34104
4 OKFVYR
5 S15582
6 S20676
7 S20808
8 TVCHS
9 TVFV60
10 TVFVMT
11 TVFVPR
12 TVFVR
13 TVFVS1
14 TVFVS2
15 TVHUSC
Found: S20676     1
Found: S20808     2
Found: S15582     3
Found: A34104     4
Found: B34104     5
Found: A43610     6
Found: TVHUSC     7
Found: TVCHS     8
Found: TVFV60     9
Found: TVFVMT    10
Found: TVFVPR    11
Found: TVFVR    12
Found: OKFVYR    13
Found: TVFVS2    14
Found: TVFVS1    15
Extracted: 15 Sequences

You supply the name of the file containing the score/ID pairs (sh2.top15), then the name of a file in which to save the sequences (sh2.top15.seq). The select program then lists the identifiers it is searching for and, as each one is found in the database, lists it on the screen again. The sequences are saved in the output file in the same order as they appear in the sh2.top15 file.
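The extraction step described above can be sketched in a few lines of Python. This is only an illustration, not the code of select itself. It assumes that each line of the score/ID file holds a score followed by an identifier separated by white space, and that each database entry starts with a header line of the form ">P1;IDENTIFIER" followed by a title line and the sequence. The function names read_ids and extract are hypothetical.

```python
import io

def read_ids(pairs_text):
    """Return the identifiers, in file order, from lines of 'SCORE ID'."""
    ids = []
    for line in pairs_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:          # skip blank or malformed lines
            ids.append(fields[1])
    return ids

def extract(ids, database):
    """Scan the database once and return the wanted entries,
    ordered as in ids (the order of the score/ID file)."""
    wanted = set(ids)
    found = {}
    current = None                    # identifier of the entry being copied
    for line in database:
        if line.startswith(">"):
            # Assumed header layout: '>P1;TVFVR' -> identifier after ';'
            ident = line.strip().split(";", 1)[-1]
            current = ident if ident in wanted else None
            if current is not None:
                found[current] = []
        if current is not None:
            found[current].append(line)
    return ["".join(found[i]) for i in ids if i in found]

# Tiny made-up database and score/ID list, for illustration only.
pairs = "120.5 TVFVR\n98.0 A34104\n"
db = io.StringIO(
    ">P1;A34104\nfirst title\nMKV*\n"
    ">P1;XXXXXX\nother title\nAAA*\n"
    ">P1;TVFVR\nsecond title\nGGC*\n"
)
entries = extract(read_ids(pairs), db)
print("".join(entries), end="")
```

The entries are found in database order but written in the order of the score/ID file, which is the behaviour described for select above.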

If you have access to a more sophisticated database program, you may prefer to use it to extract the sequences. For example, the program ``sortsco'' works much faster than ``select'' because it makes use of indexing; see Section 6.3.3 for details.

If you have not set the environment variables for the database file, then the program ``select'' will prompt you for the database filename.
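The fallback described above, where the program uses an environment variable if it is set and prompts otherwise, might look like the following sketch. The variable name PIR_DB, the prompt text and the function name database_path are all hypothetical and chosen only for illustration; they are not the names used by select.

```python
import os

def database_path(env_var="PIR_DB", ask=input):
    """Return the database filename from the environment if set,
    otherwise ask the user for it.

    env_var is a hypothetical variable name, not the one select uses.
    """
    path = os.environ.get(env_var)
    if path:                          # set and non-empty: use it directly
        return path
    return ask("Enter database filename: ")
```

Passing the prompt function as an argument keeps the sketch easy to try without typing at a terminal, for example database_path(ask=lambda prompt: "/data/pir/pir38.seq").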

gjb@bioch.ox.ac.uk