
How to build an indexed database for use with scanps

Two programs are used to build the indexed database.

simclean takes the PIR .seq file and removes any blank space from the sequence part of the file. Each sequence entry is reduced to three lines and three return characters.

id_pir3 takes the cleaned-up .seq file and generates two index files, .bin and .inx.

To run the two programs on a .seq file called "pir1.seq", type:



simclean < pir1.seq > pir1.clean

cp pir1.seq pir1.seq.safe

cp pir1.clean pir1.seq

id_pir3 pir1.seq pir1.bin pir1.inx
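The four commands above can be collected into one shell function so that a failed step stops the build before the original file is overwritten. This is only a sketch: build_db is a hypothetical name, not part of the scanps distribution. It assumes that simclean and id_pir3 are on the PATH and that both return a non-zero exit status on failure, which is not confirmed here.

```shell
# Hypothetical wrapper around the four build commands; not shipped with scanps.
# Assumption: simclean and id_pir3 signal failure with a non-zero exit status.
build_db() {
    root=$1
    # Strip blank space from the sequence part of the PIR file.
    simclean < "$root.seq" > "$root.clean" || return 1
    # Keep the original file before replacing it with the cleaned copy.
    cp "$root.seq" "$root.seq.safe" || return 1
    cp "$root.clean" "$root.seq" || return 1
    # Generate the two index files from the cleaned sequence file.
    id_pir3 "$root.seq" "$root.bin" "$root.inx" || return 1
}
```

With this definition loaded into a Bourne-type shell, typing "build_db pir1" in the directory that holds pir1.seq runs the same sequence as the commands typed by hand.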

If this all works, you should have three database files (the cleaned pir1.seq, pir1.bin and pir1.inx), with the original sequence file preserved as pir1.seq.safe.

These database files should be placed in a single directory and the environment variable GJNDBDIR set to the directory name. The environment variable GJNDBROOT should be set to the database name, in this example "pir1". In this way, multiple databases can reside in the same directory. If you want to scan using a different database, you just redefine the GJNDBROOT variable.
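Before pointing GJNDBDIR at a directory, it can help to confirm that all three files for a given database name are actually there. The function below is a minimal sketch of such a check. check_db_files is a hypothetical helper, not part of scanps, and it tests only that the files exist, not that their contents are valid.

```shell
# Hypothetical helper; not shipped with scanps.
# Usage: check_db_files DIRECTORY ROOTNAME
# Prints one line per missing file and returns 1 if any file is missing.
check_db_files() {
    dir=$1
    root=$2
    missing=0
    for ext in seq bin inx; do
        if [ ! -f "$dir/$root.$ext" ]; then
            echo "missing: $dir/$root.$ext"
            missing=1
        fi
    done
    return $missing
}
```

For example, "check_db_files /some/dir pir1" (with /some/dir standing in for the real database directory) prints nothing and returns 0 when pir1.seq, pir1.bin and pir1.inx are all present.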

For example, if we want to use the database called "brookhaven", we'd just type:



setenv GJNDBROOT brookhaven

scanps and sortsco would then expect to find the files brookhaven.seq, brookhaven.inx and brookhaven.bin in the directory defined by GJNDBDIR.
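The setenv command is csh syntax. In a Bourne-type shell (sh, bash, ksh) the equivalent is an assignment followed by export. The sketch below shows that form, together with a hypothetical function, db_paths, that prints the three file names implied by the current settings. It assumes the directory and root name are joined with a "/"; whether scanps itself requires or forbids a trailing slash in GJNDBDIR is not stated in this section.

```shell
# Bourne-shell equivalent of "setenv GJNDBROOT brookhaven".
GJNDBROOT=brookhaven
export GJNDBROOT

# Hypothetical helper; not shipped with scanps.
# Prints the three database paths implied by GJNDBDIR and GJNDBROOT.
# Assumption: directory and root name are joined with a single "/".
db_paths() {
    for ext in seq inx bin; do
        # ${GJNDBDIR%/} drops one trailing slash, if present.
        echo "${GJNDBDIR%/}/$GJNDBROOT.$ext"
    done
}
```

Running db_paths after setting both variables lists the files that need to be present before scanps or sortsco is started.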


gjb@bioch.ox.ac.uk