Group Software and Databases

Scroll down this page or jump to sections focused on:

Multiple Sequence Alignment and Analysis
Protein Structure and Prediction
Sequence Analysis
Utilities and Legacy Databases

Multiple Sequence Alignment and Analysis

Since publishing one of the first practical multiple protein sequence alignment algorithms in 1987 we have focused on methods to analyse and interpret alignments in order to understand the structure and function of proteins and other macromolecules.


Jalview

Jalview is an interactive multiple sequence alignment analysis workbench.

Jalview allows you to create, view, edit and annotate protein and nucleic acid alignments and make predictions of secondary structure and other features. Go to Jalview's own website to learn more about Jalview, to download it and to contribute to its open-source development.

Go to Jalview Website!


Alscript

Alscript is a program to format multiple sequence alignments in PostScript for publication and to assist in analysis.

Alscript does not support point-and-click, but has a scripting language to allow complex effects. The ALSCRIPT User Guide (Version 2.0) explains what Alscript does and how to use it.

For details see: Barton, 1993 and the manual.

Although some of the visualisation effects in Alscript can now be achieved by Jalview, Alscript is still widely used since it offers a lot of flexibility in how you can format and annotate alignments. The AMAS program and JPred secondary structure prediction server both provide output options generated by Alscript.

Get Alscript


Jabaws

Jabaws, or 'Jaba', is a package that includes six different multiple sequence alignment programs, four protein disorder prediction programs and other utilities for sequence analysis, and makes them available over the web using 'web services' technology.

The methods in Jabaws are available from within Jalview or can be run on the command-line. Jabaws is easy to install on your workstation, laptop or institutional computer cluster.

Go to Jabaws website


AMAS

AMAS performs sub-family analysis on a multiple protein sequence alignment to identify functionally important residues, and produces pretty output via Alscript.

The AMAS server was the first ever webserver from the Group in 1994 and one of the first in the UK. We've kept the look 'retro', but don't be put off by that - it still allows you to analyse your protein family in unique ways!

You can Download AMAS or read its manual. Alternatively, just use the server:

Go to the AMAS server


AMPS

The AMPS (Alignment of Multiple Protein Sequences) package is a suite of programs for protein multiple sequence alignment, pairwise alignment, statistical analysis and flexible pattern matching.

AMPS is kept here mainly for historical interest: it was one of the first practical multiple alignment methods, and many of the ideas first tried in this package are now standard in modern multiple alignment methods.

AMPS was the first alignment program to implement variable gap penalties to capture the fact that gaps are not uniformly distributed in protein families. It also introduced "flexible pattern matching" which allows for a set of profiles to be interlinked by defined gap ranges.

You can read the AMPS Manual or explore the ideas in AMPS in the following papers:

Get AMPS


OxBench

OxBench is a suite of programs to assess the accuracy of multiple sequence alignment methods.

OxBench includes a reference database of protein multiple sequence alignments that were generated by consideration of protein three-dimensional structure. OxBench is aimed at developers of alignment methods rather than end-users.

A key feature of OxBench is how it chooses the regions of the supplied alignment files from which accuracy is calculated, and this is not entirely straightforward to implement. If you do choose to use OxBench to evaluate an alignment method, please read the paper and use our code and R-scripts.

If you use OxBench, please cite:

Get OxBench

Protein Structure and Prediction

Relating sequence to structure to function is core to our research, and tools such as Jalview help to do this. Other tools and databases, such as those listed here, are aimed more specifically at structure beyond the primary sequence.


JPred

JPred is a web server for protein secondary structure prediction, buried site prediction and coiled-coil prediction.

JPred can also be run on a single sequence or multiple alignment from within the Jalview multiple alignment workbench.

Go to JPred Server

PIPs

PIPs provides protein-protein interaction predictions for all human proteins, obtained by combining multiple information sources in a Bayesian framework.

Go to the PIPs Database


STAMP

STAMP is software for multiple alignment of protein 3D structures.

STAMP produces a structure-based sequence alignment with confidence values for each aligned position. It also produces hierarchically arranged structure superpositions of the proteins, and can be used to search a protein structure against a database of structures. You can read the STAMP manual to get an idea of its features.

Get STAMP


DOMAK

DOMAK analyses protein three-dimensional structures to identify the location of probable compact domains. You can read a bit more about DOMAK in its manual.

DOMAK was developed to speed up the classification of protein domains in protein structure. The 3Dee database was built from DOMAK definitions followed by manual curation. Although 3Dee is no longer maintained, DOMAK still forms part of the domain definition pipeline in the CATH database at UCL.

Get DOMAK

Sequence Analysis

Many of our tools and databases analyse or organise sequences of proteins or nucleic acids. They range from polyAdb, a system to simplify access to RNAseq and DRS transcriptomics data, to TarO, which comprehensively annotates a protein sequence to help identify domains suitable for crystallisation experiments.


1433Pred

1433Pred is a web resource that predicts the location of 14-3-3 binding sites.

Go to the 1433Pred website for more information


polyAdb

polyAdb is a web resource that organises data about the location of polyadenylation sites in a genome under different conditions and in different species.

polyAdb provides access to the sequencing reads as well as one-click visualisation in the IGB genome browser. Much of the data are from third-generation Direct RNA Sequencing rather than Illumina RNAseq.

Go to the polyAdb website


NOD

NOD is software to predict nucleolar localisation sequences from the amino acid sequence.

The predictor can be downloaded or run on-line.

Go to the NOD Server


Kinomer

Kinomer is a library of HMMs and database of protein kinases organised into kinase classes.

Kinomer is built by application of a hierarchy of HMMs trained on different subsets of protein kinase sequences.

Go to Kinomer


TarO

TarO is the Target Optimisation Utility, an easy-to-use sequence analysis pipeline for protein target selection and more.

Comprehensively annotate a protein sequence to help identify domains suitable for crystallisation experiments or simply to understand the protein in more detail.

Go to TarO


Xtal

Xtal is a collection of three methods, the OB-Score, ParCrys and XANNPred, that predict the likelihood of a protein succeeding in a crystallisation experiment.

The three methods are all accessible from the Xtal website and can be run on any protein sequence. Alternatively, run them and many other methods with the TarO analysis pipeline.

Go to the Xtal webserver


GOtcha

GOtcha predicts the Gene Ontology (GO) functional class for a protein sequence.

The GOtcha website also includes analyses carried out using the method on a collection of proteomes.

Go to GOtcha

Utilities and Legacy Databases


OC

OC is a cluster analysis program.

OC implements single, complete and means linkage cluster analysis. It does not have software limits on the number of entities that can be clustered.

OC was developed to cluster large sets of protein sequences, but it is general and can be applied to any type of data.
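
The three linkage rules differ only in how the distance between two clusters is derived from the distances between their members. The sketch below is a minimal illustration in Python, not OC's own code or interface: the function names are hypothetical, and it assumes that "means linkage" is the average of all pairwise distances between the two clusters. It uses a naive search that suits only small inputs.

```python
from itertools import combinations


def linkage_distance(a, b, dist, method):
    """Distance between clusters a and b (lists of item indices).

    dist is a symmetric matrix (list of lists) of pairwise distances.
    """
    pairwise = [dist[i][j] for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # furthest pair of members
        return max(pairwise)
    if method == "means":       # assumed: mean of all pairwise distances
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def cluster(dist, method="single"):
    """Agglomerative clustering; returns the merges in order.

    Each merge is (cluster_a, cluster_b, distance_at_merge).
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of current clusters with the smallest linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(
                clusters[p[0]], clusters[p[1]], dist, method
            ),
        )
        d = linkage_distance(clusters[x], clusters[y], dist, method)
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges
```

For example, calling `cluster(dist, "complete")` on a distance matrix for four items returns three merges, and the distance recorded for each merge is the height at which those two clusters would join in a dendrogram.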

Version 2.1 (February 2004) includes more options for PostScript page layout.
Version 2.0 (August 2002) includes functions gjnoc and gjnoc2 that allow the OC software to be called from a C program.

OC does not have a formal publication, but if you use it in your work, please cite it as:
Barton, G.J. (1993, 2002, 2004) "OC - A cluster analysis program", University of Dundee, UK. You can also read the OC manual or download the source by following the link below.

Go to OC download page


Legacy Databases

Databases take a lot of work to develop, but even more to maintain long-term. Unlike a scientific paper, which once published is "finished", a database requires updating as new data become available and new techniques are developed. Unfortunately, the time and resources to do this cannot always be justified, so some databases from the Group are either pensioned off or simply kept in a static form.


SNAPPI is a database of interactions between protein structural domains. Although currently not maintained, there is a strong possibility it will be revived...

Go to SNAPPI


3Dee is a database of protein structural domains organised into a hierarchy by structural similarity. It was developed in 1993/4 when there were no databases of protein structural domains to meet the need to organise the protein data bank (PDB) in a way that would simplify large-scale analysis. 3Dee includes both continuous and discontinuous domain definitions as well as cross-references to the SCOP database which was developed around the same time. 3Dee was last updated in 1998 and won't be developed further since there are good alternatives today in SCOP and CATH.

Go to 3Dee


ngSeqUtils - utility scripts in Python

ngSeqUtils is a set of utility scripts developed by Nick Schurch.

This is a small collection of scripts to assist analysis of Next Generation Sequencing data and to handle wig, bigwig, and gff files in Python. They supplement the excellent tools available in Biopython and PyCogent. Currently there are several scripts, plus modules for command-line parsing, logging, and wig/bigwig/gff file parsing. Enjoy!
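
To give a flavour of the kind of file handling involved, here is a minimal sketch of reading GFF records in Python. It is not the ngSeqUtils API: the names `GffRecord` and `parse_gff` are hypothetical, and it assumes GFF3-style lines with nine tab-separated columns and `key=value` attributes separated by semicolons.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class GffRecord(NamedTuple):
    """One feature line from a GFF file (hypothetical container)."""
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield a GffRecord for each feature line, skipping comments and blanks."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 columns, got {len(cols)}: {line!r}")
        # Assumed GFF3-style attributes: key=value pairs separated by ';'.
        attributes = {}
        for field in cols[8].split(";"):
            if "=" in field:
                key, value = field.split("=", 1)
                attributes[key.strip()] = value.strip()
        yield GffRecord(
            cols[0], cols[1], cols[2],
            int(cols[3]), int(cols[4]),
            cols[5], cols[6], cols[7],
            attributes,
        )
```

Because `parse_gff` accepts any iterable of lines, it can be given an open file handle directly, for example `for rec in parse_gff(open(path)): ...`, where `path` is whatever GFF file you are working with.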

Go to ngSeqUtils download page



