[Jalview-discuss] Conventions for Graphical representation for protein annotations

Jim Procter foreveremain at gmail.com
Mon Feb 28 13:41:36 GMT 2011

Hi Leyla - This is a really good question :) 

On 28/02/2011 13:13, Leyla Garcia wrote:
> I need to know if there is a key of colours and shapes for the graphical 
> representation of annotations in proteins. For instance, if I need to 
> have a pictorial representation of a "domain" or "transcript" then is 
> there a standardized way to do it? So far I have seen that domains are 
> usually represented as ellipses or rectangles, and metal bindings as 
> non-filled circles, while active sites are red-filled circles.
> I am particularly interested in the following types of annotation:
> Domain, Signal, Transit, Propeptide, Peptide, Topological domain, 
> Intramembrane, Transmembrane for ranges of sequences, and Metal binding, 
> Active site, Modified residue, Lipidation, Glycosylation for point 
> positions.
I don't know of any single widely accepted notation for this kind of
annotation, only of the conventions defined by certain programs and
databases.

Off the top of my head, however (deep breath) ... Single-position
annotation conventions for proteins have mostly been developed by the
structural community for marking up sequences (e.g. LigPlot, PDBsum,
etc.), but several have come from sequence databases like UniProt. Domains
are far more varied, and I think there are at least a couple of schools
here (CDART, SMART and InterPro/Pfam). In particular, colours tend to
be used to distinguish different domains in a diagram, rather than to
indicate the exact type of domain. I doubt there is any real standard
for PTM/chemically modified residues, except for certain types of
amidoglycans (here, the important aspect is often to distinguish mono-,
bi- and triantennary sugar chains, with or without phosphoglycan). In the
genome community, I'm sure that the standards again stem from
the types of databases/tools people use (Ensembl vs Entrez vs Apollo vs
UCSC Genome Browser), as well as the kind of information relevant for a
particular organism (e.g. bacterial/plant/animal type genome
architecture). I could go on with this ramble, but instead I'd love to
hear if anyone else knows of any accepted standards!


PS. Leyla - I'll be at the DAS developers' meeting this Thursday and
Friday if you are interested in talking about this in person.

