genecovr is an R package that provides plotting functions that summarize gene transcript to genome alignments. The main purpose is to assess the effect that polishing and scaffolding operations have on the quality of a genome assembly. The gene transcript set is a large sequence set consisting of assembled transcripts from RNA-seq data generated in relation to a genome assembly project. genecovr therefore serves as a complement to software such as BUSCO, which evaluates genome assembly quality using a smaller set of well-defined single-copy orthologs.
You can install the released version of genecovr from NBIS GitHub with:
# If necessary, uncomment to install devtools
# install.packages("devtools")
devtools::install_github("NBISweden/genecovr")
The tool has been developed and tested on GNU/Linux systems but should work on any system that runs R. Installation is expected to take at most a couple of minutes.
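There is a helper script for generating basic plots located in PACKAGE_DIR/bin/genecovr. Create a comma-separated (csv) input file with the following columns:

1. data label
2. mapping file (supported formats: psl)
3. assembly file (fasta or fasta index)
4. transcript file (fasta or fasta index)

Columns 3 and 4 can be set to a missing value (NA), in which case sequence sizes are inferred from the alignment files. Then run the script to generate plots: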
PACKAGE_DIR/bin/genecovr indata.csv
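As a minimal sketch of the NA case, the following writes a one-row indata.csv in which columns 3 and 4 are left as NA. The label asm and the file name transcripts2asm.psl are hypothetical placeholders, not files shipped with the package.

```shell
# Hypothetical label and alignment file name, for illustration only.
# Columns 3 and 4 are NA, so sequence sizes would be inferred from the psl file.
cat > indata.csv <<'EOF'
asm,transcripts2asm.psl,NA,NA
EOF

# Count the rows whose assembly and transcript columns are both NA
na_rows=$(awk -F',' '$3 == "NA" && $4 == "NA" { n++ } END { print n + 0 }' indata.csv)
echo "rows with NA in columns 3 and 4: $na_rows"
```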
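Example files are located in PACKAGE_DIR/inst/extdata. They consist of two psl alignment files containing gmap alignments (transcripts2nonpolished.psl and transcripts2polished.psl), a fasta index for the transcript sequences (transcripts.fai), and two fasta indices for the different assembly versions (nonpolished.fai and polished.fai).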
Using these files and the labels nonpol and pol for the different assemblies, a genecovr input file (called e.g., assemblies.csv) would look as follows:
nonpol,transcripts2nonpolished.psl,nonpolished.fai,transcripts.fai
pol,transcripts2polished.psl,polished.fai,transcripts.fai
and the command to run would be:
genecovr assemblies.csv
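Before running the script, it can help to confirm that the input file is well formed. The sketch below writes the two-row example file and counts the rows that do not have four comma-separated fields or whose mapping file does not end in .psl. The check is a convenience added here and is not part of genecovr.

```shell
# Write the two-row example input file
cat > assemblies.csv <<'EOF'
nonpol,transcripts2nonpolished.psl,nonpolished.fai,transcripts.fai
pol,transcripts2polished.psl,polished.fai,transcripts.fai
EOF

# Sanity check (not part of genecovr): every row needs four fields,
# and the mapping file in column 2 should be a psl file
bad_rows=$(awk -F',' 'NF != 4 || $2 !~ /\.psl$/ { n++ } END { print n + 0 }' assemblies.csv)
echo "rows failing the check: $bad_rows"
```

A non-zero count points to a row with a missing column or an unsupported mapping file format.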
To list genecovr script options, type 'genecovr -h':
usage: genecovr [-h] [-v] [-p number] [-d OUTPUT_DIRECTORY] [--height HEIGHT]
                [--width WIDTH]
                csvfile

positional arguments:
  csvfile               csv-delimited file with columns
                          1. data label
                          2. mapping file (supported formats: psl)
                          3. assembly file (fasta or fasta index)
                          4. transcript file (fasta or fasta index)

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         print extra output
  -p number, --cpus number
                        number of cpus [default 1]
  -d OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        output directory
  --height HEIGHT       figure height in inches [default 6.0]
  --width WIDTH         figure width in inches [default 6.0]
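As an illustration of how these options combine, the sketch below runs the script with two cpus, a dedicated output directory, and a wider figure. It assumes that genecovr is on PATH and that assemblies.csv exists in the current directory. The directory name genecovr_plots and the option values are arbitrary choices, and creating the directory beforehand is a precaution rather than a documented requirement.

```shell
# Hypothetical output directory name; any writable path would do
outdir=genecovr_plots
mkdir -p "$outdir"

# Run only if the script is available and the input file is present
if command -v genecovr >/dev/null 2>&1 && [ -f assemblies.csv ]; then
  genecovr -v -p 2 -d "$outdir" --width 8 --height 6 assemblies.csv
else
  echo "genecovr or assemblies.csv not found; skipping run" >&2
fi
```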
Alternatively, load the package in an R script and use the package functions. See Get started or run vignette("genecovr") for a minimal working example.