mmetaplot: Generate metagene plots for multiple bam files

View source: R/metagene_V2.R

mmetaplot R Documentation

Generate metagene plots for multiple bam files

Description

Generate metagene plots and other information for multiple bam files, such as the FPM value of each bp position in the metagene plots, FPKM values in specific TSS, TTS, or gene body regions of each gene, etc.

Usage

mmetaplot(
  metafiles,
  targetgenefile = NULL,
  genomename = NULL,
  tssradius = c(2000, 1000, 500),
  ttsradius = c(2000, 1000, 500),
  genebodylen = 4000,
  labels,
  strandmethod = 1,
  threads = 1,
  savegenenames = NULL,
  plotgenenames = TRUE,
  mergecases = FALSE,
  genelencutoff = NULL,
  fpkmcutoff = 1,
  textsize = 13,
  titlesize = 15,
  face = "bold"
)

Arguments

metafiles

The bam files used to generate the metagene plots. Should be a character vector in which each element is the path to one bam file.

targetgenefile

The genes whose FPM values are merged to generate the metagene plots. If it is NULL, all the genes in the genome specified by the parameter genomename will be analyzed. If it is provided by the user, columns with names such as chr, start, end, strand, and gene_id are required. All genes should be longer than the maximum value of the parameters tssradius, ttsradius, and genelencutoff.
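As an illustration of the column layout and the length requirement, here is a minimal sketch in base R. It assumes the target genes are held in a data frame (this page does not say whether a data frame or a file path is expected), that coordinates are 1-based and inclusive, and all coordinates and gene IDs are made up.

```r
# Hypothetical target gene table; coordinates and gene IDs are invented
targetgenes <- data.frame(
  chr     = c("chr1", "chr1", "chr2"),
  start   = c(100000, 500000, 250000),
  end     = c(160000, 503000, 330000),
  strand  = c("+", "-", "+"),
  gene_id = c("GeneA", "GeneB", "GeneC"),
  stringsAsFactors = FALSE
)

# Parameter values that would be passed to mmetaplot
tssradius     <- c(2000, 1000, 500)
ttsradius     <- c(2000, 1000, 500)
genelencutoff <- 40000

# Gene length in bp, assuming 1-based inclusive coordinates
genelen <- targetgenes$end - targetgenes$start + 1

# Minimum length implied by the three parameters
minlen <- max(tssradius, ttsradius, genelencutoff)

# Keep only the genes that are longer than that minimum
keptgenes <- targetgenes[genelen > minlen, ]
keptgenes$gene_id
```

Filtering the table this way before calling mmetaplot makes it easy to see which genes would not satisfy the requirement.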

genomename

The genome of the genes to be analyzed, used when the parameter targetgenefile is NULL. Can be "hg38" or "mm10".

tssradius

To plot the metagene in the TSS region, set this parameter to a numeric vector in which each element is a radius (bp) of the TSS region centered on the TSS site. One metagene plot will be generated for each radius value.

ttsradius

To plot the metagene in the TTS region, set this parameter to a numeric vector in which each element is a radius (bp) of the TTS region centered on the TTS site. One metagene plot will be generated for each radius value.
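The way a radius translates into a genomic window can be sketched as follows. The helper regionwindow is hypothetical and not part of proRate; it assumes the window spans radius bp on each side of the site, and that on the minus strand the TSS is the annotated end coordinate and the TTS the annotated start coordinate.

```r
# Hypothetical helper, not part of proRate: window around a TSS or TTS
regionwindow <- function(start, end, strand, radius, site = c("TSS", "TTS")) {
  site <- match.arg(site)
  # On the plus strand the TSS is the gene start and the TTS the gene end;
  # on the minus strand the two are swapped
  usestart <- (strand == "+") == (site == "TSS")
  center <- ifelse(usestart, start, end)
  data.frame(center = center,
             left   = center - radius,
             right  = center + radius)
}

# Same coordinates, opposite strands, 1000 bp radius around the TSS
plusTSS  <- regionwindow(100000, 160000, strand = "+", radius = 1000, site = "TSS")
minusTSS <- regionwindow(100000, 160000, strand = "-", radius = 1000, site = "TSS")
plusTSS
minusTSS
```

With tssradius = c(2000, 1000, 500), a window of this kind would be built once per radius, giving three TSS metagene plots.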

genebodylen

To plot the metagene in the gene body region, set this parameter to a single numeric value (bp). Because different genes have different gene body lengths, their gene bodies must be scaled to a unified length before their FPM values are merged into the metagene, and that length is set by this parameter. Genes with a longer gene body will be compressed to this length, and genes with a shorter one will be extended.
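One way to picture the scaling step is linear interpolation onto a fixed number of positions. The sketch below uses approx from base R; the helper scalegenebody and the toy signals are hypothetical, and the package may use a different scaling method internally.

```r
# Hypothetical helper, not part of proRate: rescale a per-bp signal vector
# to a fixed length by linear interpolation
scalegenebody <- function(fpm, genebodylen) {
  approx(x = seq_along(fpm), y = fpm, n = genebodylen)$y
}

set.seed(1)
longgene  <- runif(10000)  # toy per-bp signal for a 10 kb gene body
shortgene <- runif(1500)   # toy per-bp signal for a 1.5 kb gene body

scaledlong  <- scalegenebody(longgene,  genebodylen = 4000)  # compressed
scaledshort <- scalegenebody(shortgene, genebodylen = 4000)  # extended

# Both genes now contribute vectors of equal length and can be averaged
metagene <- colMeans(rbind(scaledlong, scaledshort))
length(metagene)
```

Interpolation is only one option; averaging the signal within bins would be another reasonable choice for the compression case.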

labels

A character vector whose elements are included in the titles of the metagene plots to indicate the experimental conditions of the bam files. This vector should contain no duplicated elements.

strandmethod

The strand-specific method used when preparing the sequencing libraries. Can be 1 for the directional ligation method, 2 for the dUTP method, or 0 for non-strand-specific libraries. If the samples were sequenced using a single-strand method, set it to 3.

threads

Number of threads to use for parallelization. Default is 1.

savegenenames

The genes whose FPM value at each bp position should be saved or plotted.

plotgenenames

Whether to plot the FPM value at each bp position for the genes provided by the parameter savegenenames.

mergecases

Whether to merge the data of all bam files into one dataset and then use it to generate the metagene plots. Default is FALSE.

genelencutoff

The cutoff on gene lengths (bp). The default value is NULL, but if it is set, only genes with a length longer than this cutoff will be considered for the metagene plotting.

fpkmcutoff

The cutoff on gene FPKM values. Only genes with an FPKM value greater than the cutoff will be considered. Default is 1.
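For reference, the standard definitions of FPM and FPKM can be written in a few lines of R and combined with the cutoff. The functions and numbers below are illustrative only; how the package counts fragments per gene is not described on this page.

```r
# Standard definitions; function names and numbers are illustrative only
fpm <- function(count, totalfragments) {
  count * 1e6 / totalfragments
}
fpkm <- function(count, regionlen, totalfragments) {
  count * 1e9 / (regionlen * totalfragments)
}

# Three toy genes in a library of 20 million mapped fragments
fpkmvals <- fpkm(count          = c(500, 20, 3000),
                 regionlen      = c(50000, 60000, 45000),
                 totalfragments = 2e7)

# Apply the cutoff: only genes with FPKM greater than 1 are kept
fpkmcutoff <- 1
passed <- fpkmvals > fpkmcutoff
passed
```

In this toy example only the third gene exceeds the default cutoff of 1.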

textsize

The font size for the plot texts. Default is 13.

titlesize

The font size for the plot titles. Default is 15.

face

The font face for the plot texts. Default is "bold".

Value

Generates several metagene plots and returns a list with several sub-lists. These contain the FPKM values in specific regions for each gene, the FPM value at each bp position for the metagene plots and for the genes indicated by the parameter savegenenames, etc.

Examples

library(proRate)

wt0file <- system.file("extdata", "wt0.bam", package = "proRate")
ko0file <- system.file("extdata", "ko0.bam", package = "proRate")

metareslist <- mmetaplot(metafiles = c(wt0file, ko0file), 
                        labels = c("WT", "KO"), 
                        tssradius = c(1000, 500), 
                        ttsradius = c(1000), 
                        genebodylen = 2000, 
                        strandmethod = 1, 
                        genomename = "mm10", 
                        genelencutoff = 40000, 
                        fpkmcutoff = 1)




yuabrahamliu/proRate documentation built on Nov. 3, 2024, 10:14 a.m.