                       IUPred RELEASE NOTES
                       ====================

IUPred Version 1.0

by Zsuzsanna Dosztanyi


IUPred is supplied in source code form along with the required data files. The
program is written in ANSI C and the code should compile on any ANSI C compiler
e.g. the GNU C compiler. To be able to run outside the source directory, the
IUPred_PATH  environment variable has to be set to the location of the source
files. 

TO COMPILE: 

cc iupred.c -o iupred 

TO RUN IUPred:

iupred seqfile type
  
  where seqfile is the name of the sequence file
  
  type is any of the option of 

  	long
	short 
	glob
	
  for prediction of long disorder, short disorder ( e.g. missing residues in
  X-ray structures) or predicting globular domains. 


INPUT FILE: sequence_file in fasta format. One sequence per file.

EXAMPLE RUN: 

iupred P53_HUMAN.seq long


INTERPRETATION OF THE OUTPUT:

In the case of long and short types of disorder the output  gives the
likelihood of disorder for each residue, i.e. it is a value between 0 and 1,
and higher values indicate higher probability of disorder. Residues with values
above 0.5 can be regarded as disordered, and at this cutoff 5% of globular
proteins is expected to be predicted to disordered (false positives).
 
For the prediction type of globular domains it gives the number of globular
domains and list their start and end position in the sequence. This is followed
by the submitted sequence with residues of globular domains indicated by
uppercase letters. 


Please see the LICENSE file for the license terms for the software. It is
basically free for academic users, but a license fee applies to commercial
users. 

THE PUBLICATION OF RESEARCH USING IUPred MUST INCLUDE AN APPROPRIATE
CITATION TO THE METHOD:

The Pairwise Energy Content Estimated from Amino Acid Composition Discriminates 
between Folded and Intrinsically Unstructured Proteins
Zsuzsanna Dosztnyi, Veronika Csizmk, Pter Tompa and Istvn Simon
J. Mol. Biol. (2005) 347, 827-839.


SHORT SUMMARY OF THE METHOD

Intrinsically unstructured/disordered proteins have no single well-defined
tertiary structure in their native, functional state. Our server recognizes
such regions from the amino acid sequence based on the estimated pairwise
energy content. The underlying assumption is that globular proteins make a
large number of interresidue interactions, providing the stabilizing energy to
overcome the entropy loss during folding. In contrast, IUPs have special
sequences that do not have the capacity to form sufficient interresidue
interactions. Taking a set of globular proteins with known structure, we have
developed a simple formalism that allows the estimation of the pairwise
interaction energies of these proteins. It uses a quadratic expression in the
amino acid composition, which takes into account that the contribution of an
amino acid to order/disorder depends not only its own chemical type, but also
on its sequential environment, including its potential interaction partners.
Applying this calculation for IUP sequences, their estimated energies are
clearly shifted towards less favorable energies compared to globular proteins,
enabling the predicion of protein disorder on this ground. 
