Group Software and Databases

Scroll down this page or jump to sections focused on:

Multiple Sequence Alignment and Analysis
Protein Structure and Prediction
Sequence Analysis
Utilities and Legacy Databases

The Barton Group on GitHub

In addition to the links below, the Group also releases software through GitHub. See the GitHub account for Geoff Barton's Computational Biology Group.

Jalview has its own Git repository together with a mechanism for reporting bugs and feature requests. Please see the Jalview Development Pages for details.

Dundee Resource for Sequence Analysis and Structure Prediction (DRSASP)

The Dundee Resource for Sequence Analysis and Structure Prediction is a collection of services provided by the Barton Group. The Dundee Resource includes services such as Jpred, JABAWS, AACon, and others currently under development.

Multiple Sequence Alignment and Analysis

Since publishing one of the first practical multiple protein sequence alignment algorithms in 1987 we have focused on methods to analyse and interpret alignments in order to understand the structure and function of proteins and other macromolecules.

Jalview

Jalview is an interactive multiple sequence alignment analysis workbench.

Jalview allows you to create, view, edit and annotate protein and nucleic acid alignments and make predictions of secondary structure and other features. Go to Jalview's own website to learn more about Jalview, to download it and to contribute to its open-source development.

Go to Jalview Website!

Alscript

Alscript is a program to format multiple sequence alignments in PostScript for publication and to assist in analysis.

Alscript does not support point-and-click, but has a scripting language to allow complex effects. The ALSCRIPT User Guide (Version 2.0) explains what Alscript does and how to use it.

For details see: Barton, 1993 and the manual.

Although some of the visualisation effects in Alscript can now be achieved by Jalview, Alscript is still widely used since it offers a lot of flexibility in how you can format and annotate alignments. The AMAS program and JPred secondary structure prediction server both provide output options generated by Alscript.

Get Alscript

Jabaws

Jabaws or 'Jaba' is a package that includes six different multiple sequence alignment programs, four protein disorder prediction programs and other utilities for sequence analysis and makes them available over the web using 'webservices' technology.

The methods in Jabaws are available from within Jalview or can be run on the command-line. Jabaws is easy to install on your workstation, laptop or institutional computer cluster.

Go to Jabaws website

AMAS

AMAS performs sub-family analysis on a multiple protein sequence alignment to identify functionally important residues and produces pretty output via Alscript.

The AMAS server was the first ever webserver from the Group in 1994 and one of the first in the UK. We've kept the look 'retro', but don't be put off by that - it still allows you to analyse your protein family in unique ways!

You can Download AMAS or read its manual. Alternatively, just use the server:

Go to the AMAS server

AMPS

The AMPS (Alignment of Multiple Protein Sequences) package is a suite of programs for protein multiple sequence alignment, pairwise alignment, statistical analysis and flexible pattern matching.

It is here mainly for historical interest: it was one of the first practical multiple alignment methods, and many of the ideas tried out in this package are now standard in more modern multiple alignment methods.

AMPS was the first alignment program to implement variable gap penalties to capture the fact that gaps are not uniformly distributed in protein families. It also introduced "flexible pattern matching" which allows for a set of profiles to be interlinked by defined gap ranges.

You can read the AMPS Manual or explore the ideas in AMPS in the following papers:

Barton, G. J. (1990), [Review] "Protein Multiple Sequence Alignment and Flexible Pattern Matching", Meth. Enzymol., 183, 403-428.
Barton, G. J. and Sternberg, M. J. E., (1987), "A Strategy for the Rapid Multiple Alignment of Protein Sequences: Confidence Levels From Tertiary Structure Comparisons", J. Mol. Biol., 198, 327-337.
Barton, G. J. and Sternberg, M. J. E., (1987), "Evaluation and Improvements in the Automatic Alignment of Protein Sequences", Protein Engineering, 1, 89-94.
Barton, G. J. and Sternberg, M. J. E., (1990), "Flexible Protein Sequence Patterns - A Sensitive Method to Detect Weak Structural Similarities", J. Mol. Biol., 212, 389-402.

Get AMPS

OxBench

OxBench is a suite of programs to assess the accuracy of multiple sequence alignment methods.

OxBench includes a reference database of protein multiple sequence alignments that were generated by consideration of protein three-dimensional structure. OxBench is aimed at developers of alignment methods rather than end-users.

A key feature of OxBench is how it chooses which regions of the supplied alignment files to calculate accuracy from, and this is not entirely straightforward to implement. If you do choose to use OxBench to evaluate an alignment method, please read the paper and use our code and R-scripts.

If you use OxBench, please cite:

Raghava, G. P. S., Searle, S. M. J., Audley, P. C., Barber, J. D. and Barton, G. J. (2003), "OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy", BMC Bioinformatics, 4:47.

Get OxBench

Protein Structure and Prediction

Relating sequence to structure to function is core to our research and tools such as Jalview help to do this. Other tools and databases such as those listed here are more specifically aimed at non-primary structure.

LIGYSIS

Web server for the analysis of ligand binding sites.

LIGYSIS brings together the 3D structure of all ligands that are bound in complex to a protein. It classifies each binding site by evolutionary conservation, human genetic variation (based on homologues where the protein is not human) and likely function and presents all this in a convenient web interface.

LIGYSIS also allows you to upload your own set of protein-ligand complexes for analysis by the same methods.

Go to LIGYSIS Server

JPred

Web server for protein secondary structure prediction, buried site prediction and coiled coils prediction.

JPred can also be run on a single sequence or multiple alignment from within the Jalview multiple alignment workbench.

Go to JPred Server

PIPs

Protein-Protein Interaction Predictions for all human proteins obtained by combination of multiple information sources in a Bayesian framework.
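As a rough illustration of what combining information sources in a Bayesian framework can look like, the sketch below multiplies prior odds by one likelihood ratio per evidence source. This assumes the sources are conditionally independent (a naive Bayes simplification). The function name, the prior and the likelihood ratios are hypothetical and chosen only for illustration; they are not taken from PIPs.

```python
def combine_evidence(prior_odds, likelihood_ratios):
    """Return the posterior probability of interaction.

    prior_odds        -- odds of interaction before seeing any evidence
    likelihood_ratios -- one ratio per evidence source:
                         P(evidence | interact) / P(evidence | not interact)

    Assumption: evidence sources are conditionally independent, so the
    posterior odds are simply the prior odds times every likelihood ratio.
    """
    if prior_odds < 0 or any(lr < 0 for lr in likelihood_ratios):
        raise ValueError("odds and likelihood ratios must be non-negative")
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    # Convert odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical numbers: a rare event (1 in 1000 odds) supported by two sources.
probability = combine_evidence(0.001, [50.0, 20.0])
```

With no evidence sources the function simply returns the probability implied by the prior odds, which is a useful sanity check on any real set of likelihood ratios.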

Go to the PIPs Database

STAMP

Software for multiple alignment of protein 3D structures.

STAMP produces a structure-based sequence alignment with confidence values for each aligned position, together with hierarchically arranged structure superpositions of the proteins. It can also be used to search a protein structure against a database of structures. You can read the STAMP manual to get an idea of its features.

Get STAMP

DOMAK

DOMAK analyses protein three-dimensional structures to identify the location of probable compact domains. You can read a bit more about DOMAK in its manual.

DOMAK was developed to speed up the classification of protein domains in protein structure. The 3Dee database was built from DOMAK definitions followed by manual curation. Although 3Dee is no longer maintained, DOMAK still forms part of the domain definition pipeline in the CATH database at UCL.

Get DOMAK

Sequence Analysis

Many of our tools and databases analyse or organise sequences of proteins or nucleic acids. They range from polyAdb, a system to simplify access to RNAseq and DRS transcriptomics data, to Kinomer, which classifies protein kinases.

14-3-3-Pred

14-3-3-Pred is a web resource that predicts the location of 14-3-3 binding sites.

14-3-3-Pred provides a simple yet useful interface to the new methods developed in the Barton Group to score potential Ser/Thr centred motifs for likelihood of binding 14-3-3 proteins.

Go to the 1433Pred website for more information

AACon

AACon is a web resource for Amino Acid Conservation Calculation.

AACon is a set of tools implementing 17 different conservation scores reviewed by Valdar as well as the more complex SMERFS algorithm for predicting protein functional sites.
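To give a feel for what a conservation score computes, here is a minimal sketch of one entropy-based measure of the general kind covered in Valdar's review: one minus the normalised Shannon entropy of each alignment column. It is not AACon's code or API. The function names, the choice to ignore gaps and the normalisation by log2(min(N, 20)) are assumptions made for this illustration.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Return 1 - normalised Shannon entropy for one alignment column.

    Assumptions of this sketch: gaps ('-') are ignored, and a column
    with no residues at all scores 0.0.
    """
    residues = [c.upper() for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # The largest entropy reachable with this many residues.
    max_entropy = math.log2(min(total, alphabet_size))
    if max_entropy == 0:
        return 1.0  # a single residue is trivially conserved
    return 1.0 - entropy / max_entropy


def alignment_conservation(sequences):
    """Score every column of an alignment given as equal-length strings."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_conservation(col) for col in zip(*sequences)]


# Toy alignment: two invariant columns followed by a fully variable one.
scores = alignment_conservation(["ACA", "ACC", "ACD", "ACE"])
```

A score of 1.0 means every sequence has the same residue in that column, and 0.0 means the residues are as evenly spread as they can be. Scores that also account for physicochemical similarity between residues need a substitution matrix, which this sketch leaves out.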

Go to the AACon website for more information

NOD

Software to predict nucleolar localisation sequences from the amino acid sequence.

The predictor can be downloaded or run on-line.

Go to the NOD Server

Kinomer

Kinomer is a library of HMMs and database of protein kinases organised into kinase classes.

Kinomer is built by application of a hierarchy of HMMs trained on different subsets of protein kinase sequences.

Go to Kinomer

Xtal

Xtal is a collection of three methods: the OB-Score, ParCrys and XANNPred that predict the likelihood of a protein succeeding in a crystallisation experiment.

The three methods are all accessible from the Xtal website and can be run on any protein sequence. Alternatively, run them and many other methods with the TarO analysis pipeline.

Go to the Xtal webserver

GOtcha

GOtcha predicts the Gene Ontology (GO) functional class for a protein sequence.

The GOtcha website also includes analyses carried out using the method on a collection of proteomes.

Go to GOtcha

Utilities and Legacy Databases

OC

OC is a cluster analysis program.

OC implements single, complete and means linkage cluster analysis. It does not have software limits on the number of entities that can be clustered.

OC was developed to cluster large sets of protein sequences, but it is general and can be applied to any type of data.
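The three linkage rules differ only in how the distance between two clusters is derived from the pairwise distances of their members. The sketch below is a deliberately naive Python illustration of that idea, not OC's implementation. The function name is hypothetical, and "means" is assumed here to be the mean of all pairwise distances between the two clusters.

```python
def agglomerative(dist, linkage="single"):
    """Merge clusters bottom-up from a symmetric distance matrix.

    dist    -- list of lists, dist[i][j] is the distance between items i and j
    linkage -- "single" (minimum), "complete" (maximum) or
               "means" (mean of all pairwise distances; an assumption here)

    Returns the merges in order as (members_a, members_b, distance) tuples.
    """
    combine = {
        "single": min,
        "complete": max,
        "means": lambda values: sum(values) / len(values),
    }[linkage]

    def cluster_distance(a, b):
        return combine([dist[i][j] for i in a for j in b])

    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair among the current clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        a, b = clusters[x], clusters[y]
        merges.append((a, b, d))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(a + b)
    return merges


# Four points on a line at positions 0, 1, 3 and 7.
example = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
single_merges = agglomerative(example, "single")
```

Because the function only needs a distance matrix, the same code applies to any kind of data for which pairwise distances can be defined. It recomputes every cluster distance on each pass, so it is only suitable for small inputs.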

Version 2.1 (February 2004) includes more options for PostScript page layout.
Version 2.0 (August 2002) includes functions gjnoc and gjnoc2 that allow the OC software to be called from a C program.

OC does not have a formal publication, but if you use it in your work, please cite it as:
Barton, G.J. (1993, 2002, 2004) "OC - A cluster analysis program", University of Dundee, UK. You can also read the OC manual or download the source by following the link below.

Go to OC download page

Legacy Databases

Databases take a lot of work to develop, but even more to maintain long-term. Unlike a scientific paper, which once published is "finished", a database requires updating as new data and techniques are developed. Unfortunately, the time and resources to do this cannot always be justified, so some databases from the Group are either pensioned off or simply kept in a static form.

SNAPPI is a database of interactions between protein structural domains. Although currently not maintained, there is a strong possibility it will be revived...

Go to SNAPPI

3Dee is a database of protein structural domains organised into a hierarchy by structural similarity. It was developed in 1993/4, when there were no databases of protein structural domains, to meet the need to organise the Protein Data Bank (PDB) in a way that would simplify large-scale analysis. 3Dee includes both continuous and discontinuous domain definitions as well as cross-references to the SCOP database, which was developed around the same time. 3Dee was last updated in 1998 and won't be developed further since there are good alternatives today in SCOP and CATH.

Go to 3Dee

TarO (Target Optimisation Utility) was an easy-to-use sequence analysis pipeline for protein target selection and more.

TarO comprehensively annotated a protein sequence to help identify domains suitable for crystallisation experiments or simply to understand the protein in more detail. After running for some years it was retired in July 2017. However, some elements of TarO will be revived as part of the

You can read the paper describing TarO, "TarO: a target optimisation system for structural biology" by Overton et al., to see what it did and the reasoning behind it.

If you press this button it takes you to the error page for TarO! Go to TarO

PolyADB was a database to gather together results from our longstanding collaboration with Gordon Simpson's group. However, it became unwieldy to maintain and our priorities for the bioinformatics in the collaboration moved elsewhere. This link may still work, or perhaps will take you to a "not found" message.

ngSeqUtils - utility scripts in Python

ngSeqUtils is a set of utility scripts developed by Nick Schurch.

This is a small collection of scripts to assist data analysis of Next Generation Sequencing data and handle wig, bigwig, and gff files in Python. They supplement the excellent tools available in Biopython and PyCogent. Currently there are several scripts and command-line parsing, logging, and wig/bigwig/gff file parsing modules. Enjoy!
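For readers unfamiliar with the gff format, the following is a minimal, standard-library-only sketch of reading the nine tab-separated columns of a GFF3 line. It does not use or reproduce the ngSeqUtils modules. The names `GffRecord` and `parse_gff3` are hypothetical, and the sketch assumes GFF3-style `key=value` attributes and does not handle an embedded `##FASTA` section or URL-escaped values.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class GffRecord(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield one GffRecord per feature line, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
        # Column 9 holds semicolon-separated key=value pairs.
        attributes = {}
        for pair in fields[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attributes[key.strip()] = value.strip()
        yield GffRecord(
            fields[0], fields[1], fields[2],
            int(fields[3]), int(fields[4]),
            fields[5], fields[6], fields[7],
            attributes,
        )


# Made-up example input; any iterable of lines (such as an open file) works.
example_lines = [
    "##gff-version 3\n",
    "chr1\ttest\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=abc\n",
]
records = list(parse_gff3(example_lines))
```

Writing the parser as a generator means a large annotation file can be streamed one line at a time instead of being loaded into memory in full.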

Go to ngSeqUtils download page