Which database should I search? Local or network?

The answer to this question depends a little on why you are performing the search. If you have just determined a new sequence, then it is essential to search the most recent and up-to-date databases available to test whether your new protein is unique. Having the most up-to-date database is less important if the aim is to gather a well-known family of proteins together for multiple alignment as an aid to modelling.

The nucleic acid and protein sequence databases are collated by EMBL in Europe, the NCBI in the USA and the DDBJ in Japan. In addition, the NBRF in the USA provides a database of nucleic acid and protein sequences. The databases are distributed on CD-ROM by the EMBL, NCBI and NBRF organisations, and if you require a local database to scan, this is the preferred method of obtaining it. Some of the database distributions include software for searching the databases (e.g. the NBRF-ATLAS program).

The disks are normally updated every three months, but since over 1,000 new protein sequences are deposited per month, even the current disk is out of date as soon as it arrives! To overcome this problem, the database providers also maintain daily or weekly updates that cover everything added since the last CD release. If you are searching with a newly determined sequence, you should ideally scan a database that includes all sequences available up to today and, if nothing is found, periodically rescan the updated database.

Maintaining regular updates of the sequence databases is usually beyond the scope of an individual investigator; however, major data centres do maintain such updated databases, together with software for searching them. Indeed, provided you have e-mail access to the Internet and are prepared to accept the scanning tools offered by the database centre, there is no compelling reason to maintain the databases locally. However, while network access to a database may provide the most up-to-date version of the data, it does not necessarily give the most effective scanning method for your sequence.
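To put the staleness of a quarterly disk into numbers, the short Python sketch below estimates how many protein sequences a local copy is missing a given number of months after its release. It assumes a constant deposit rate of 1,000 sequences per month; since the text says "over 1,000", the result is only a rough lower bound. The function name and its parameters are illustrative and do not come from any database distribution.

```python
def missing_sequences(months_since_release: float,
                      deposits_per_month: int = 1000) -> int:
    """Rough lower bound on sequences deposited since the last disk release.

    Assumes a constant deposit rate (an illustrative simplification).
    """
    if months_since_release < 0:
        raise ValueError("months_since_release must be non-negative")
    return int(months_since_release * deposits_per_month)


if __name__ == "__main__":
    # Walk through one three-month release cycle.
    for months in (0, 1, 2, 3):
        print(f"{months} month(s) after release: "
              f"at least {missing_sequences(months)} sequences missing")
```

Under these assumptions, a disk at the end of its three-month cycle lacks at least 3,000 protein sequences, which is why a newly determined sequence should be checked against a database that includes the daily or weekly updates.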

geoff.barton@ox.ac.uk