public class ConservationCalculator
extends java.lang.Object
List<FastaSequence> sequences = CmdParser
.openInputStream(<PATH_TO_INPUT_FILE>);
2) A boolean parameter telling the system whether the results should be
normalized. Normalized results have values between 0 and 1. Note, however,
that some results cannot be normalized. In such a case, the system returns
non-normalized values and logs the issue to the standard error stream. The
following formula is used for normalization: n = (d - dmin)/(dmax - dmin).
Negative results are first converted to positive by adding the absolute
value of the most negative result.
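The normalization rule above can be illustrated with a small standalone sketch. It is not the library's implementation: the class and method names are hypothetical, and the choice to treat a zero range (dmax equal to dmin) as the "cannot be normalized" case is an assumption made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the compbio API.
final class NormalizationSketch {

    // Applies n = (d - dmin) / (dmax - dmin) after shifting negative scores.
    static double[] normalize(double[] scores) {
        if (scores.length == 0) {
            return new double[0];
        }
        // Shift all scores by |most negative score| so none is negative.
        double mostNegative = Arrays.stream(scores).min().getAsDouble();
        double shift = mostNegative < 0 ? Math.abs(mostNegative) : 0.0;
        double[] shifted = Arrays.stream(scores).map(d -> d + shift).toArray();

        double dmin = Arrays.stream(shifted).min().getAsDouble();
        double dmax = Arrays.stream(shifted).max().getAsDouble();
        // Assumption: a zero range cannot be normalized, so the original
        // scores are returned and the issue is reported on standard error.
        if (dmax == dmin) {
            System.err.println("Scores cannot be normalized: zero range");
            return scores.clone();
        }
        return Arrays.stream(shifted)
                .map(d -> (d - dmin) / (dmax - dmin))
                .toArray();
    }

    public static void main(String[] args) {
        // Prints [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(normalize(new double[] {-2.0, 0.0, 2.0})));
    }
}
```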
int corenum = Runtime.getRuntime().availableProcessors();
ExecutorService executor = Executors.newFixedThreadPool(corenum);
// ... do calculations ...
// shutdown the Executor
executor.shutdown();
Please take care to initialize and pass only one executor to all the methods
to avoid wasting resources. After use, the executor must be disposed of,
which can be done as follows: executor.shutdown();
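One way to make sure a single shared executor is always disposed of is to wrap the calculations in try/finally. The sketch below uses only java.util.concurrent; the `placeholderScore` task is a hypothetical stand-in for a conservation calculation, and the 10-second wait is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the compbio API.
final class SharedExecutorSketch {

    // Stand-in for one conservation calculation.
    static double placeholderScore(int seed) {
        return seed * 0.5;
    }

    static List<Double> runAll(int taskCount) throws Exception {
        int corenum = Runtime.getRuntime().availableProcessors();
        // One executor is created and reused for every task.
        ExecutorService executor = Executors.newFixedThreadPool(corenum);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int seed = i;
                tasks.add(() -> placeholderScore(seed));
            }
            List<Double> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the tasks.
            for (Future<Double> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            // Dispose of the executor even if a calculation failed.
            executor.shutdown();
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(4));
    }
}
```

Placing shutdown() in the finally block means the worker threads are released even when a task throws, so the JVM is not kept alive by an idle pool.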
An example use of this class for calculating conservation is below:
// Determine the number of CPU cores available on the system.
int corenum = Runtime.getRuntime().availableProcessors();
// Initialize the Executor instance with a number of cores
ExecutorService executor = Executors.newFixedThreadPool(corenum);
// Load the data from the file containing either Clustal formatted alignment
// or a list of FASTA formatted sequences
List<FastaSequence> sequences = CmdParser
.openInputStream("test/data/small.align");
// Calculate conservation scores using all the methods available.
Map<ConservationMethod, double[]> result = ConservationCalculator
.getConservation(sequences, true,
EnumSet.allOf(ConservationMethod.class), executor);
// Print the results to the console.
ConservationFormatter.outputScoreLine(result, System.out);
Constructor and Description |
---|
ConservationCalculator() |
Modifier and Type | Method and Description |
---|---|
static java.util.Map<compbio.data.sequence.ConservationMethod,double[]> | getConservation(java.util.List<compbio.data.sequence.FastaSequence> alignment, boolean normalize, java.util.Set<compbio.data.sequence.ConservationMethod> methods, java.util.concurrent.ExecutorService executor) Calculates the conservation by all the methods defined by ConservationMethod enumeration apart from SMERFS. |
static double[] | getSMERFSScore(compbio.data.sequence.Alignment alignment, int windowWidth, compbio.data.sequence.SMERFSConstraints scoringMethod, float gapTreshold, boolean normalize, java.util.concurrent.ExecutorService service) Calculating the SMERFS score with custom parameters |
public static java.util.Map<compbio.data.sequence.ConservationMethod,double[]> getConservation(java.util.List<compbio.data.sequence.FastaSequence> alignment, boolean normalize, java.util.Set<compbio.data.sequence.ConservationMethod> methods, java.util.concurrent.ExecutorService executor) throws java.lang.InterruptedException

Calculates the conservation by all the methods defined by the
ConservationMethod enumeration apart from SMERFS.

Parameters:
alignment - the list of FastaSequence objects holding the alignment. All
sequences must be of the same length.
normalize - true if the resulting scores should be normalized, false
otherwise.
methods - the methods to be used for the calculation; all
ConservationMethod values but SMERFS can be used. The EnumSet class
provides a number of convenience methods which can be used to prepare a
set of methods for the input. For example, to use ConservationMethod.KABAT
for conservation calculation one could construct a set in the following
way: EnumSet.of(ConservationMethod.KABAT). Other sets can be built with
EnumSet.complementOf(EnumSet.of(ConservationMethod.SMERFS)), which selects
every method except SMERFS, or
EnumSet.range(ConservationMethod.KABAT, ConservationMethod.GERSTEIN), which
selects the methods from KABAT to GERSTEIN in declaration order. See
EnumSet for details.
executor - the ExecutorService to be used to parallelize the calculations
Throws:
java.lang.InterruptedException - if the calculating thread was interrupted

public static double[] getSMERFSScore(compbio.data.sequence.Alignment alignment, int windowWidth, compbio.data.sequence.SMERFSConstraints scoringMethod, float gapTreshold, boolean normalize, java.util.concurrent.ExecutorService service)
Parameters:
alignment - the Alignment object holding each sequence from the alignment
windowWidth - the window size parameter for the SMERFS algorithm
scoringMethod - the SMERFSConstraints to use for scoring
gapTreshold - the gap threshold for the SMERFS algorithm
normalize - true if the resulting scores should be normalized, false
otherwise
service - the ExecutorService to be used to parallelize the calculations
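
The EnumSet constructions mentioned for the methods parameter of getConservation can be tried out in isolation. The sketch below uses a hypothetical `DemoMethod` enum in place of compbio.data.sequence.ConservationMethod; only KABAT, GERSTEIN and SMERFS are named in this page, so the PLACEHOLDER constant and the declaration order are assumptions made purely for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical demo class, not part of the compbio API.
final class MethodSetSketch {

    // Stand-in for ConservationMethod; constants and order are assumed.
    enum DemoMethod { KABAT, PLACEHOLDER, GERSTEIN, SMERFS }

    // A set holding exactly one method.
    static Set<DemoMethod> single() {
        return EnumSet.of(DemoMethod.KABAT);
    }

    // Every method except SMERFS.
    static Set<DemoMethod> allButSmerfs() {
        return EnumSet.complementOf(EnumSet.of(DemoMethod.SMERFS));
    }

    // Every method from KABAT to GERSTEIN inclusive, in declaration order.
    static Set<DemoMethod> kabatToGerstein() {
        return EnumSet.range(DemoMethod.KABAT, DemoMethod.GERSTEIN);
    }

    public static void main(String[] args) {
        System.out.println(single());          // [KABAT]
        System.out.println(allButSmerfs());    // [KABAT, PLACEHOLDER, GERSTEIN]
        System.out.println(kabatToGerstein()); // [KABAT, PLACEHOLDER, GERSTEIN]
    }
}
```

Because EnumSet.range depends on declaration order, complementOf is the safer way to exclude SMERFS when the position of a constant in the real enumeration is not known.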