Since publishing one of the first practical multiple protein sequence alignment algorithms in 1987, we have focused on methods to analyse and interpret alignments in order to understand the structure and function of proteins and other macromolecules.
Jalview is an interactive multiple sequence alignment analysis workbench.
Jalview allows you to create, view, edit and annotate protein and nucleic acid alignments and make predictions of secondary structure and other features. Go to Jalview's own website to learn more about Jalview, to download it and to contribute to its open-source development.
Alscript is a program to format multiple sequence alignments in PostScript for publication and to assist in analysis.
Alscript does not support point-and-click, but has a scripting language to allow complex effects. The ALSCRIPT User Guide (Version 2.0) explains what Alscript does and how to use it.
Although some of the visualisation effects in Alscript can now be achieved by Jalview, Alscript is still widely used since it offers a lot of flexibility in how you can format and annotate alignments. The AMAS program and JPred secondary structure prediction server both provide output options generated by Alscript.
Jabaws or 'Jaba' is a package that includes six different multiple sequence alignment programs, four protein disorder prediction programs and other utilities for sequence analysis and makes them available over the web using 'webservices' technology.
The methods in Jabaws are available from within Jalview or can be run on the command-line. Jabaws is easy to install on your workstation, laptop or institutional computer cluster.
Performs sub-family analysis on a multiple protein sequence alignment to identify functionally important residues and produces pretty output via Alscript.
The AMAS server was the first ever webserver from the Group in 1994 and one of the first in the UK. We've kept the look 'retro', but don't be put off by that - it still allows you to analyse your protein family in unique ways!
The AMPS (Alignment of Multiple Protein Sequences) package is a suite of programs for protein multiple sequence alignment, pairwise alignment, statistical analysis and flexible pattern matching.
It is included here mainly for historical interest, since it was one of the first practical multiple alignment methods and many of the ideas tried out in this package are now standard in more modern multiple alignment methods.
AMPS was the first alignment program to implement variable gap penalties to capture the fact that gaps are not uniformly distributed in protein families. It also introduced "flexible pattern matching", which allows for a set of profiles to be interlinked by defined gap ranges. You can read the AMPS Manual or explore the ideas in AMPS in the following papers:
OxBench is a suite of programs to assess the accuracy of multiple sequence alignment methods.
OxBench includes a reference database of protein multiple sequence alignments that were generated by consideration of protein three-dimensional structure. OxBench is aimed at developers of alignment methods rather than end-users.
A key feature in OxBench is how it chooses which regions of the supplied alignment files to calculate accuracy from, and this is not entirely straightforward to implement. If you do choose to use OxBench to evaluate an alignment method, please read the paper and use our code and R-scripts.
If you use OxBench, please cite:
Relating sequence to structure to function is core to our research and tools such as Jalview help to do this. Other tools and databases such as those listed here are more specifically aimed at non-primary structure.
Web server for protein secondary structure prediction, buried site prediction and coiled coils prediction.
JPred can also be run on a single sequence or multiple alignment from within the Jalview multiple alignment workbench.
Protein-Protein Interaction Predictions for all human proteins obtained by combination of multiple information sources in a Bayesian framework.
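To give a feel for what combining information sources in a Bayesian framework can look like, here is a minimal sketch in Python. It is not the code behind these predictions: it assumes the evidence sources are independent (a naive Bayes simplification), and the function name `posterior_probability`, the prior and the likelihood ratios are all hypothetical values chosen for illustration.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine evidence sources using Bayes' rule in odds form.

    prior: prior probability that two proteins interact (0 < prior < 1).
    likelihood_ratios: one ratio per evidence source,
        P(evidence | interacting) / P(evidence | not interacting).
    Assumes the sources are conditionally independent (an assumption of
    this sketch, not a statement about any published method).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Work in log space so that many small or large ratios do not overflow.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical example: a 1% prior and two supportive evidence sources.
score = posterior_probability(0.01, [10.0, 5.0])
```

Each source simply multiplies the prior odds by its likelihood ratio, so a source with a ratio of 1 leaves the prediction unchanged and a ratio below 1 counts as evidence against an interaction.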
Software for multiple alignment of protein 3D structures.
Produces a structure-based sequence alignment with confidence values for each aligned position. Also produces hierarchically arranged structure superpositions of the proteins. Can also be used to search a protein structure against a database of structures. You can read the STAMP manual to get an idea of its features.
DOMAK analyses protein three-dimensional structures to identify the location of probable compact domains. You can read a bit more about DOMAK in its manual.
DOMAK was developed to speed up the classification of protein domains in protein structure. The 3Dee database was built from DOMAK definitions followed by manual curation. Although 3Dee is now no longer maintained, DOMAK still forms part of the domain definition pipeline in the CATH database at UCL.
Many of our tools and databases analyse or organise sequences of proteins or nucleic acids. They range from polyAdb, a system to simplify access to RNAseq and DRS transcriptomics data, to TarO, which comprehensively annotates a protein sequence to help identify domains suitable for crystallisation experiments.
1433Pred is a web resource that predicts the location of 14-3-3 binding sites. Go to the 1433Pred website for more information.
Web resource that organises data about the location of polyadenylation sites in a genome under different conditions and in different species.
polyAdb provides access to the sequencing reads as well as one-click visualisation in the IGB genome browser. Much of the data are from third-generation Direct RNA Sequencing rather than Illumina RNAseq data. Go to the polyAdb website.
Software to predict nucleolar localisation sequences from the amino acid sequence.
The predictor can be downloaded or run on-line.
Kinomer is a library of HMMs and database of protein kinases organised into kinase classes.
Kinomer is built by application of a hierarchy of HMMs trained on different subsets of protein kinase sequences.
Target Optimisation Utility. Easy to use sequence analysis pipeline for protein target selection and more.
Comprehensively annotate a protein sequence to help identify domains suitable for crystallisation experiments or simply to understand the protein in more detail.
The three methods are all accessible from the Xtal website and can be run on any protein sequence. Alternatively, run them and many other methods with the TarO analysis pipeline.
GOtcha predicts the Gene Ontology (GO) functional class for a protein sequence.
The GOtcha website also includes analyses carried out using the method on a collection of proteomes.
OC is a cluster analysis program.
OC implements single, complete and means linkage cluster analysis. It does not have software limits on the number of entities that can be clustered.
OC was developed to cluster large sets of protein sequences, but it is general and can be applied to any type of data.
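The linkage rules mentioned above differ only in how the distance between two clusters is derived from the distances between their members. The following is a minimal Python sketch of that idea, not OC's own code: the function name `agglomerate` is hypothetical, the "mean" option uses the average of all pairwise distances (which we assume corresponds to means linkage), and the naive search is far slower than a real implementation would be.

```python
from itertools import combinations


def agglomerate(dist, linkage="single"):
    """Naive agglomerative clustering on a symmetric distance matrix.

    dist: list of lists where dist[i][j] is the distance between i and j.
    linkage: "single" (minimum), "complete" (maximum) or "mean" (average)
        of the pairwise distances between members of two clusters.
    Returns the merges in order as (members_a, members_b, distance).
    """
    combine = {
        "single": min,
        "complete": max,
        "mean": lambda values: sum(values) / len(values),
    }[linkage]

    # Start with every entity in its own cluster.
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters under the chosen linkage rule.
        for a, b in combinations(clusters, 2):
            d = combine([dist[i][j] for i in a for j in b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two clusters with their union and record the merge.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Four entities lying at positions 0, 1, 5 and 7 on a line.
example = [
    [0, 1, 5, 7],
    [1, 0, 4, 6],
    [5, 4, 0, 2],
    [7, 6, 2, 0],
]
for members_a, members_b, distance in agglomerate(example, "complete"):
    print(members_a, members_b, distance)
```

Because the input is just a distance matrix, the same routine works for sequences, structures or any other data for which you can compute pairwise distances.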
Version 2.1 (February 2004) includes more options for PostScript page layout.
Version 2.0 (August 2002) includes functions gjnoc and gjnoc2 that allow the OC software to be called from a C program.
OC does not have a formal publication, but if you use it in your work, please cite it as:
Barton, G.J. (1993, 2002, 2004) "OC - A cluster analysis program", University of Dundee, UK. You can also read the OC manual or download the source by following the link below.
Databases take a lot of work to develop, but even more to maintain long-term. Unlike a scientific paper, which once published is "finished", a database requires updating as new data and techniques are developed. Unfortunately, the time and resources to do this cannot always be justified, so some databases from the Group are either pensioned off or simply kept in a static form.
SNAPPI is a database of interactions between protein structural domains. Although currently not maintained, there is a strong possibility it will be revived...
3Dee is a database of protein structural domains organised into a hierarchy by structural similarity. It was developed in 1993/4, when there were no databases of protein structural domains, to meet the need to organise the Protein Data Bank (PDB) in a way that would simplify large-scale analysis. 3Dee includes both continuous and discontinuous domain definitions as well as cross-references to the SCOP database, which was developed around the same time. 3Dee was last updated in 1998 and won't be developed further since there are good alternatives today in SCOP and CATH.
ngSeqUtils is a set of utility scripts developed by Nick Schurch.
This is a small collection of scripts to assist data analysis of Next Generation Sequencing data and handle wig, bigwig, and gff files in Python. They supplement the excellent tools available in Biopython and PyCogent. Currently there are several scripts and command-line parsing, logging, and wig/bigwig/gff file parsing modules. Enjoy!
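As an illustration of the kind of file handling involved, here is a minimal Python sketch that parses one line of a GFF3 file. It does not use or reproduce the ngSeqUtils modules: the names `GffRecord` and `parse_gff3_line` are hypothetical, and the sketch assumes well-formed GFF3 with nine tab-separated columns and `key=value` attributes, without decoding URL-escaped characters.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    """One feature line from a GFF3 file (the nine standard columns)."""
    seqid: str
    source: str
    type: str
    start: int               # 1-based, inclusive
    end: int                 # 1-based, inclusive
    score: Optional[float]   # None when the column holds "."
    strand: str
    phase: str
    attributes: dict


def parse_gff3_line(line):
    """Return a GffRecord, or None for blank lines and comments/directives."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields

    # Column 9 is a semicolon-separated list of key=value pairs.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()

    return GffRecord(
        seqid, source, ftype, int(start), int(end),
        None if score == "." else float(score),
        strand, phase, attributes,
    )


# Hypothetical feature line for illustration.
record = parse_gff3_line("chr1\texample\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=abc\n")
```

For real analyses, an established parser is usually the safer choice since it will handle escaping, multi-valued attributes and the other corners of the format.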