Important note:
Execution of this program requires a successful run of GeneMark-ES for your genome.
Running GeneMark-ES is only required once for each genome.
Requirements:
- samtools in your path.
- Perl package Parallel::ForkManager. A copy is included if you wish to install it manually,
although it may be simpler to install it from CPAN.
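A minimal sketch for checking both requirements before installing. The helper names have_cmd and have_perl_module are made up for this example; the cpan command in the last message assumes the standard CPAN client is available on your system.

```shell
# Succeeds if the named command can be found in PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_cmd samtools; then
    echo "samtools: found"
else
    echo "samtools: NOT found in PATH"
fi

if have_perl_module Parallel::ForkManager; then
    echo "Parallel::ForkManager: found"
else
    echo "Parallel::ForkManager: missing (try: cpan Parallel::ForkManager)"
fi
```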
Installation:
Extract the tarball on your system, then run the CSH script 'install.csh' residing in
the folder /src. This will build all of the source code and move the programs to /bin.
Make sure all files in /bin have execute privileges. Then add the /bin directory to
your path and you are all set.
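The last two steps (execute privileges and the path) could be sketched as below. This covers only what happens after install.csh has finished; the function name enable_bin_dir is hypothetical, and the bin directory is passed as an argument because its location depends on where you extracted the tarball.

```shell
# Make every regular file in a directory executable and put that
# directory first in PATH for the current shell session.
# $1: the bin directory filled by install.csh (location is an assumption).
enable_bin_dir() {
    bin_dir=$1
    if [ ! -d "$bin_dir" ]; then
        echo "not a directory: $bin_dir" >&2
        return 1
    fi
    for f in "$bin_dir"/*; do
        if [ -f "$f" ]; then
            chmod +x "$f"
        fi
    done
    PATH="$bin_dir:$PATH"
    export PATH
}
```

Call it with the full path of the bin directory. To keep the change across sessions, the PATH line would also have to go into your shell startup file.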
Running:
- unsplicer_pair.pl (to align paired-end reads)
- unsplicer_single.pl (for single-end reads)
Note: if /es is the location of a completed run of GeneMark-ES for your genome, then the
model directory for UnSplicer will be /es/mod, and the gene predictions will be
/es/pred_orig_name.gff (assuming you have mapped the predictions back to chromosome or
scaffold coordinates by running 'es.pl -mapback' in the /es directory).
No files other than the model file should be located in /es/mod.
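The layout described in the note can be checked before starting UnSplicer. The following is a sketch only: check_es_run is a hypothetical helper, not part of UnSplicer or GeneMark-ES, and it tests nothing beyond the two conditions stated above.

```shell
# Check a completed GeneMark-ES run for the layout UnSplicer expects.
# $1: directory of the GeneMark-ES run (called /es in the note above).
check_es_run() {
    es_dir=$1
    if [ ! -d "$es_dir/mod" ]; then
        echo "missing model directory: $es_dir/mod" >&2
        return 1
    fi
    # Count every entry in mod, hidden ones included.
    n=$(ls -A "$es_dir/mod" | wc -l | tr -d ' ')
    if [ "$n" -ne 1 ]; then
        echo "expected exactly 1 file in $es_dir/mod, found $n" >&2
        return 1
    fi
    if [ ! -f "$es_dir/pred_orig_name.gff" ]; then
        echo "missing $es_dir/pred_orig_name.gff (was 'es.pl -mapback' run?)" >&2
        return 1
    fi
    echo "GeneMark-ES output in $es_dir looks usable"
}
```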
Sample GeneMark-ES parameter files:
These model files can be used with UnSplicer for the associated genome. They were used
to generate the results shown in the UnSplicer publication.
Ab initio predictions have been made using these models on the reference assemblies (2nd column in the table below).
If your genome assembly is a different version, you can create new predictions
by following these instructions:
- place the downloaded model file in a folder called mod
- rename the file to model.0mtx (IMPORTANT!)
- run the prediction step of GeneMark-ES (es.pl -pred -mapback dna.fna) in the parent
  folder of mod, where dna.fna is the reference sequence assembly
- the folder mod and the predictions file pred_orig_name.gff will be given to UnSplicer
  as input (in addition to the read sequences and reference assembly file)
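The first two steps above could be scripted roughly as follows. The function name stage_model and its arguments are assumptions made for this example; it copies the downloaded model rather than moving it, so the original download stays untouched. The es.pl call is left as a comment because it needs a working GeneMark-ES installation.

```shell
# Copy a downloaded model file to <work_dir>/mod/model.0mtx.
# $1: downloaded model file, $2: working directory (parent folder of mod).
stage_model() {
    model_file=$1
    work_dir=$2
    if [ ! -f "$model_file" ]; then
        echo "model file not found: $model_file" >&2
        return 1
    fi
    mkdir -p "$work_dir/mod" || return 1
    cp "$model_file" "$work_dir/mod/model.0mtx"
}

# After staging, the prediction step from the list above is run in work_dir,
# with dna.fna being the reference sequence assembly:
#   cd "$work_dir" && es.pl -pred -mapback dna.fna
```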