
Identifying new PDB files and backup

When a primary database is updated, new data are deposited and old data are removed. A derived database must reflect these changes, and relationships between objects in the derived database complicate the alterations required. Our primary database, the PDB, changes rapidly, with approximately 200 new structures deposited each month and a handful removed. Identifying PDB files that are new, modified or to be deleted from the existing version of 3Dee is a straightforward task. However, any changes must be propagated through all levels of data representation in 3Dee. For example, because much of the data in 3Dee involves relationships between different PDB files, information from new PDB files must be added to the existing hierarchy. At the same time, data concerning PDB files that are common to the existing version of 3Dee and the new version of the PDB must be preserved, in order to avoid time-consuming data regeneration steps.

In the first stage of a database update, the old version of 3Dee is backed up to disk. To save disk space, a complete copy of the PDB and derived data is not made. Only PDB files that are to be deleted from the main area of 3Dee are stored. Data related to old PDB files (i.e. PDBC data, PDBSEQ data, UNIQUE data, secondary structure definition files and domain definition files) are removed from the main area of the database and stored in backup directories. Domain families are also backed up. Together, this information is sufficient to allow an old version of the database to be regenerated.

Some PDB files are present in both the new and old versions of 3Dee, but have been modified since the last version of the PDB was downloaded. The modified files are compared with the originals to determine whether the changes in them are major (changes to the structure data) or minor (changes to comments only). If the changes are minor, the files are copied into the main area of the database, but treated as unchanged PDB files. Major changes require the PDB file to be treated as if it were a new file. In both cases, information pertaining to the older version is copied to the backup.
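
The classification described above can be illustrated with a short Python sketch. It is not the procedure used by 3Dee, whose comparison method is not specified here. The function names are hypothetical, and the sketch assumes that "structure data" means the ATOM and HETATM coordinate records, with every other record type treated as comment-level information.

```python
# Record names assumed to carry structure data. This choice is an
# assumption made for illustration, not the 3Dee definition.
STRUCTURE_RECORDS = {"ATOM", "HETATM"}


def structure_lines(text):
    """Return the lines whose record name (columns 1-6) is a structure record.

    Trailing whitespace is stripped so that padding differences are ignored.
    """
    return [
        line.rstrip()
        for line in text.splitlines()
        if line[:6].strip() in STRUCTURE_RECORDS
    ]


def classify_change(old_text, new_text):
    """Label a file present in both releases as unchanged, minor or major."""
    if old_text == new_text:
        return "unchanged"
    if structure_lines(old_text) == structure_lines(new_text):
        return "minor"  # only non-structure records differ
    return "major"      # structure records differ: treat as a new file


def compare_releases(old, new):
    """Compare two releases given as dicts of PDB identifier -> file text."""
    report = {"new": [], "deleted": [], "unchanged": [], "minor": [], "major": []}
    for code in sorted(set(old) | set(new)):
        if code not in old:
            report["new"].append(code)
        elif code not in new:
            report["deleted"].append(code)
        else:
            report[classify_change(old[code], new[code])].append(code)
    return report
```

In such a scheme, the entries listed under "deleted", "minor" and "major" would identify the files whose older versions need to be copied to the backup, while "new" and "major" would identify the files to be processed as new.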
Uwe Dengler, 2000-10-16