addConnections_peak_gene: Add peak-gene connections to a 'GRN' object

View source: R/core.R

addConnections_peak_gene    R Documentation

Add peak-gene connections to a GRN object

Description

Add peak-gene connections to a GRN object

Usage

addConnections_peak_gene(
  GRN,
  overlapTypeGene = "TSS",
  corMethod = "pearson",
  promoterRange = 250000,
  TADs = NULL,
  nCores = 4,
  plotDiagnosticPlots = TRUE,
  plotGeneTypes = list(c("all"), c("protein_coding"), c("protein_coding", "lincRNA")),
  outputFolder = NULL,
  addRobustRegression = FALSE,
  forceRerun = FALSE
)

Arguments

GRN

Object of class GRN

overlapTypeGene

Character. "TSS" or "full". Default "TSS". If set to "TSS", only the TSS of the gene is used as reference for finding genes in the neighborhood of a peak. If set to "full", the whole annotated gene (including all exons and introns) is used instead.

corMethod

Character. "pearson" or "spearman". Default "pearson". Method for calculating the correlation coefficient. See cor for details.
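As a rough illustration of how the two methods can differ, the following sketch calls the base R function cor on made-up vectors. The names peak_signal and gene_expr are hypothetical stand-ins for one peak and one gene measured across samples; they are not taken from a GRN object.

```r
# Hypothetical values for one peak and one gene across six samples
peak_signal <- c(1, 2, 3, 4, 5, 100)
gene_expr   <- c(2, 4, 5, 7, 9, 10)

# Pearson measures linear association and is sensitive to the outlier
r_pearson <- cor(peak_signal, gene_expr, method = "pearson")

# Spearman works on ranks, so a strictly increasing relation gives 1
r_spearman <- cor(peak_signal, gene_expr, method = "spearman")
```

Because both vectors increase strictly, the Spearman coefficient is 1, while the Pearson coefficient is lower due to the outlying last value.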

promoterRange

Integer >=0. Default 250000. The size of the neighborhood, in bp, within which peaks and genes are correlated. Only peak-gene pairs within the specified range are correlated. Increasing this value leads to longer running times and more peak-gene pairs being correlated, while decreasing it has the opposite effect.
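The effect of the range can be sketched with a toy example in base R. All coordinates and gene names below are hypothetical, and measuring the distance from the peak center to the TSS is an assumption made for illustration only; how the package measures the distance is not stated on this page.

```r
# Hypothetical peak and gene TSS positions on the same chromosome (in bp)
peak_center   <- 1000000
gene_tss      <- c(geneA = 900000, geneB = 1200000, geneC = 1400000)
promoterRange <- 250000

# Assumption: distance is measured from the peak center to the TSS
in_range <- abs(gene_tss - peak_center) <= promoterRange

# Genes that would be paired with the peak under this assumption
genes_in_range <- names(gene_tss)[in_range]
```

With these toy values, geneA and geneB fall within the range and geneC does not; a smaller promoterRange would shrink the set further.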

TADs

Data frame with TADs (topologically associating domains). Default NULL. If provided, the neighborhood of a peak is defined by the TAD the peak lies in rather than by a fixed-size neighborhood. The expected format is a BED-like data frame with at least 3 columns in this order: chromosome, start, end. An optional 4th column is used as the ID column. All additional columns, as well as the column names, are ignored. The types of the first 3 columns are verified as part of a data integrity check.
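A minimal sketch of a data frame in the layout described above, built with base R. The coordinates and IDs are made up, and the column types checked here (character chromosome, numeric start and end) are an assumption about what the integrity check expects.

```r
# Hypothetical TAD coordinates; column names are ignored, only the order matters
TADs <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100000L, 900000L, 50000L),
  end   = c(850000L, 1600000L, 700000L),
  ID    = c("TAD1", "TAD2", "TAD3"),  # optional 4th column
  stringsAsFactors = FALSE
)

# Simple sanity checks (assumed types, not the package's own check)
stopifnot(
  ncol(TADs) >= 3,
  is.character(TADs[[1]]),
  is.numeric(TADs[[2]]),
  is.numeric(TADs[[3]]),
  all(TADs[[2]] < TADs[[3]])
)
```

Such a data frame could then be supplied via the TADs argument, for example addConnections_peak_gene(GRN, TADs = TADs).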

nCores

Integer >0. Default 4. Number of cores to use.

plotDiagnosticPlots

TRUE or FALSE. Default TRUE. Generate diagnostic plots? If set to TRUE, PDF files are produced and saved in the output directory (in a subfolder called plots).

plotGeneTypes

List of character vectors. Default list(c("all"), c("protein_coding"), c("protein_coding", "lincRNA")). Each list element may consist of one or multiple gene types that are plotted collectively in one PDF. The special keyword "all" denotes all gene types that are found (be aware: this typically contains 20+ gene types, see https://www.gencodegenes.org/pages/biotypes.html for details).

outputFolder

Character or NULL. Default NULL. If set to NULL, the default output folder, as specified when initializing the object with initializeGRN, will be used. Otherwise, all output from this function will be put into the specified folder. We recommend specifying an absolute path.

addRobustRegression

TRUE or FALSE. EXPERIMENTAL. Default FALSE. Use a robust regression in addition to a non-robust one? Significantly increases overall running time.

forceRerun

TRUE or FALSE. Default FALSE. Force execution, even if the GRN object already contains the result. Overwrites the old results.

Value

The same GRN object, with added data from this function in different flavors.

Examples

# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
# Use a small 10 kb neighborhood and skip the diagnostic plots
GRN = addConnections_peak_gene(GRN, promoterRange = 10000, plotDiagnosticPlots = FALSE)

chrarnold/GRaNIE documentation built on April 28, 2022, 2:18 a.m.