Summary

In the early years of sequence searching, only a few specialised centres had the computing facilities and programming expertise needed to perform the scans. In the early to mid 1980s, the availability of personal computers, and of software that could perform useful analyses on them (e.g. FASTP), meant that it was normally most efficient to perform searches locally. Today, the optimum choice is again swinging towards databases maintained at a few centres, but fast networks and windowing workstations now allow the user to run software locally and be unaware that the search is being carried out on a computer in another country.

Perhaps the best example of this to date is the Entrez software [69], available from the U.S. NCBI (ask for information from info@ncbi.nlm.nih.gov). Entrez provides a windowing interface to a database that integrates the nucleotide and protein sequence databases with associated references and abstracts. Entrez will either use the database on CD-ROM or, with a suitable network connection, interrogate the master database at the NCBI in Washington. While Entrez does not provide searching facilities for a new sequence, it stores pre-computed similarities between pairs of sequences in the database. Thus, one can quickly navigate between a protein name, the sequence, its close homologues, the corresponding DNA sequence and all relevant publications. Network Entrez was heavily used when compiling this chapter!
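The idea of navigating by pre-computed similarities can be illustrated with a small sketch. This is not how Entrez itself is implemented; it is a toy, in-memory model, and every name in it (`Entry`, `DATABASE`, `find_by_name`, `close_homologues`), along with the accessions, sequences and scores, is made up for illustration. The point is only that finding homologues becomes a table lookup once the pairwise comparisons have been stored in advance.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """One protein entry in a toy database (hypothetical layout)."""
    name: str
    sequence: str
    dna_accession: str
    references: list[str] = field(default_factory=list)
    # Similar entries stored in advance as (accession, score) pairs.
    neighbours: list[tuple[str, float]] = field(default_factory=list)


# Made-up accessions, sequences and scores, for illustration only.
DATABASE = {
    "P001": Entry("example kinase A", "MKTAYIAK", "D001", ["ref-1"],
                  [("P002", 0.91), ("P003", 0.40)]),
    "P002": Entry("example kinase B", "MKTAYLAK", "D002", ["ref-2"],
                  [("P001", 0.91)]),
    "P003": Entry("example transferase", "MSTPYIGR", "D003", [],
                  [("P001", 0.40)]),
}


def find_by_name(name: str) -> Optional[str]:
    """Return the accession of the entry with this exact name, if any."""
    for accession, entry in DATABASE.items():
        if entry.name == name:
            return accession
    return None


def close_homologues(accession: str, min_score: float = 0.5) -> list[str]:
    """Look up stored neighbours; no sequence comparison is run here."""
    entry = DATABASE[accession]
    ranked = sorted(entry.neighbours, key=lambda pair: pair[1], reverse=True)
    return [acc for acc, score in ranked if score >= min_score]
```

Starting from a protein name, `find_by_name` gives an accession; from there the sequence, the DNA accession, the references and the list returned by `close_homologues` are all direct lookups, which is why this style of navigation is fast but cannot answer a query for a sequence that is not already in the database.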

The advantages of centralised databases for the user are:

The drawback of a centralised service is that one has to accept the service provider's view of the best way to perform the search. However, with more database centres giving public access to search facilities every year, there is an increasing choice of algorithms available.





geoff.barton@ox.ac.uk