View source: R/plot_outlier_ideogram.R
plot_outlier_ideogram | R Documentation |
For this function to work, you must have a closely related reference genome and must run minimap2 externally (tested with v2.17). minimap2 maps the assembly scaffolds to the reference genome. The input should include all the unplaced assembly scaffolds as well as the transcriptome, combined into one assembly FASTA file. Run minimap2 with the --cs and -N 100 options and the default PAF output format. You will need to adjust the asm preset (-x asm5, asm10, or asm20) to suit how closely related your reference genome is (i.e., sequence divergence). See the minimap2 documentation for more info: https://github.com/lh3/minimap2.

Next, run PAFScaff on the PAF file output by minimap2 (https://github.com/slimsuite/pafscaff). PAFScaff parses the minimap2 output and improves the mapping. The pafInfo argument is the path to the *.scaffolds.tdt output file from PAFScaff, which this function requires. Reference chromosomes should be named as integers in PAFScaff (1, 2, etc.).

This function also depends on previous output from get_bgc_outliers() and join_bgc_gff(). See ?get_bgc_outliers and ?join_bgc_gff. Run BGC and get_bgc_outliers() separately for both the transcriptome-aligned data and all other scaffolded loci, then run join_bgc_gff() on only the transcriptome-aligned data. Both are required as input to make the ideograms. See https://cran.r-project.org/web/packages/RIdeogram/index.html for more info on plotting with RIdeogram.
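As a rough illustration of the minimap2 step described above, the command can be assembled and launched from R. This is only a sketch under stated assumptions: the file names are placeholders, the asm10 preset is an arbitrary choice that should be changed to match your sequence divergence, and the PAFScaff step is not shown.

```r
# Placeholder file names (assumptions); replace with your own paths.
ref_fasta      <- "reference.fasta"
assembly_fasta <- "assembly_plus_transcriptome.fasta"
paf_out        <- "population1.paf"

# --cs and -N 100 are the options named in this documentation.
# -x asm10 is an arbitrary preset chosen for illustration only.
minimap2_args <- c("-x", "asm10", "--cs", "-N", "100",
                   ref_fasta, assembly_fasta)

# Print the equivalent shell command so it can be inspected first.
cat("minimap2", minimap2_args, ">", paf_out, "\n")

# Run only if minimap2 is on the PATH and both input files exist.
if (nzchar(Sys.which("minimap2")) &&
    all(file.exists(ref_fasta, assembly_fasta))) {
  system2("minimap2", args = minimap2_args, stdout = paf_out)
}
```

minimap2 writes PAF to standard output by default, so the sketch redirects stdout to a file that can then be passed to PAFScaff.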
plot_outlier_ideogram(
prefix,
outliers.genes,
outliers.full.scaffolds,
pafInfo,
plotDIR = "./plots",
both.outlier.tests = FALSE,
both.outlier.tests.genes = FALSE,
overlap.zero = TRUE,
overlap.zero.genes = TRUE,
qn.interval = TRUE,
qn.interval.genes = TRUE,
missing.chrs = NULL,
miss.chr.length = NULL,
gene.size = 5e+05,
other.size = 1e+05,
convert_svg = "pdf",
colorset1 = c("#4575b4", "#ffffbf", "#d73027"),
colorset2 = c("#4575b4", "#ffffbf", "#d73027"),
chrnum.prefix = NULL,
genes.only = FALSE,
linked.only = FALSE
)
prefix |
Prefix for output files |
outliers.full.scaffolds |
List containing outlier data from get_bgc_outliers(). See ?get_bgc_outliers. This must be loci aligned to full scaffolds |
pafInfo |
Path to *.scaffolds.tdt file output from PAFScaff |
plotDIR |
Directory to save output plots |
both.outlier.tests |
Boolean; If TRUE, scaffold outliers must meet both the overlap.zero and qn.interval criteria |
both.outlier.tests.genes |
Boolean; If TRUE, gene outliers must meet both the overlap.zero.genes and qn.interval.genes criteria |
overlap.zero |
Boolean; If TRUE, scaffold outliers are SNPs whose credible interval does not contain zero |
overlap.zero.genes |
Boolean; If TRUE, gene outliers are SNPs whose credible interval does not contain zero |
qn.interval |
Boolean; If TRUE, scaffold outliers fall outside the quantile interval qn/2 and 1-qn/2 |
qn.interval.genes |
Boolean; If TRUE, gene outliers fall outside the quantile interval qn/2 and 1-qn/2 |
missing.chrs |
If specified, must be a character vector of missing chromosome names. Chromosome numbers should be prefixed with "chr", e.g., c("chr3", "chr6"). Use this option if some chromosomes don't get plotted |
gene.size |
Adjust the size for each outlier transcriptome gene on the ideogram. If the loci appear too small or too large on the ideogram, adjust this parameter |
other.size |
Adjust the size for each outlier scaffold gene on the ideogram |
convert_svg |
Format to which the SVG output plot is converted. Default is "pdf", but you can use "png" or other commonly used devices |
colorset1 |
Vector of colors for RIdeogram alpha heatmap. Default is the same as the RIdeogram defaults |
colorset2 |
Vector of colors for RIdeogram beta heatmap. Default is the same as the RIdeogram defaults |
chrnum.prefix |
Prefix for chromosome numbers on ideogram plot |
genes.only |
Boolean; If TRUE, only include known genes on ideogram |
linked.only |
Boolean; If TRUE, only include non-genes on ideogram |
outliers.genes |
List containing gene outlier data from get_bgc_outliers(). See ?get_bgc_outliers. This must be outliers from a transcriptome alignment |
miss.chr.length |
Vector of integer lengths (in bp) of missing chromosomes. Must also be specified if missing.chrs is used. Vector must also be the same length as missing.chrs |
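The overlap.zero and both.outlier.tests descriptions above can be illustrated with a small self-contained sketch. It is not the package's implementation: the data frame toy and its columns lower, upper, and qn.outlier are hypothetical, and the quantile-interval result is supplied as a precomputed logical rather than derived here.

```r
# Hypothetical per-SNP summaries; column names are illustrative only.
toy <- data.frame(
  snp        = c("snp1", "snp2", "snp3", "snp4"),
  lower      = c(-0.40, 0.10, -0.90, -0.05),  # lower credible bound
  upper      = c( 0.30, 0.80, -0.20,  0.60),  # upper credible bound
  qn.outlier = c(FALSE, TRUE, FALSE, FALSE)   # assumed precomputed quantile test
)

# overlap.zero criterion: the credible interval excludes zero.
toy$excludes.zero <- toy$lower > 0 | toy$upper < 0

# both.outlier.tests = TRUE: a SNP must pass both criteria.
toy$both <- toy$excludes.zero & toy$qn.outlier

toy[, c("snp", "excludes.zero", "both")]
```

In this toy data, requiring both criteria keeps fewer SNPs than the credible-interval test alone, which is the practical effect of setting both.outlier.tests = TRUE.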
Function to plot outlier BGC loci as heatmaps on chromosome ideograms.
Data.frame containing reference info for gene outliers.
plot_outlier_ideogram(prefix = "population1",
outliers.genes = outliers.genes.annotated,
outliers.full.scaffolds = outliers.full.scaffolds,
pafInfo = "./population1.scaffolds.tdt",
plotDIR = "./plots",
both.outlier.tests = TRUE,
missing.chrs = c("chr11", "chr21", "chr25"),
miss.chr.length = c(4997863, 1374423, 1060959),
gene.size = 1e6,
other.size = 5e5,
convert_svg = "png",
colorset1 = c("#4575b4", "#ffffbf", "#d73027"),
colorset2 = c("#fc8d59", "#ffffbf", "#91bfdb"))
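Because missing.chrs and miss.chr.length must be the same length, and chromosome names are expected to carry the "chr" prefix, it may help to check both vectors before calling the function. The helper below, check_missing_chrs(), is hypothetical and not part of the package. It also assumes integer-style names such as "chr11", following the note that reference chromosomes are named as integers.

```r
# Hypothetical helper (not part of the package): sanity-check the
# missing-chromosome arguments before passing them on.
check_missing_chrs <- function(missing.chrs, miss.chr.length) {
  stopifnot(
    is.character(missing.chrs),
    is.numeric(miss.chr.length),
    # The two vectors must pair up one-to-one.
    length(missing.chrs) == length(miss.chr.length),
    # Assumption: names look like "chr" followed by an integer.
    all(grepl("^chr[0-9]+$", missing.chrs)),
    # Lengths are in bp and should be positive.
    all(miss.chr.length > 0)
  )
  invisible(TRUE)
}

check_missing_chrs(
  missing.chrs    = c("chr11", "chr21", "chr25"),
  miss.chr.length = c(4997863, 1374423, 1060959)
)
```

If any condition fails, stopifnot() raises an error naming the failed check, which is usually easier to interpret than a failure inside the plotting code.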