SingleCoeditedRegion: Extracts contiguous co-edited genomic regions from a single...

View source: R/SingleCoeditedRegion.R

SingleCoeditedRegion    R Documentation

Extracts contiguous co-edited genomic regions from a single genomic region.

Description

Extracts contiguous co-edited genomic regions from an input genomic region.

Usage

SingleCoeditedRegion(
  region_df,
  rnaEditMatrix,
  output = c("GRanges", "dataframe"),
  rDropThresh_num = 0.4,
  minPairCorr = 0.1,
  minSites = 3,
  method = c("spearman", "pearson"),
  minEditFreq = 0.05,
  returnAllSites = FALSE,
  verbose = TRUE
)

Arguments

region_df

A data frame with the input genomic region. Please make sure columns seqnames, start, and end are included in the data frame.

rnaEditMatrix

A matrix (or data frame) of RNA editing level values at individual sites, with row names as site IDs in the form "chrAA:XXXXXXXX" and column names as sample IDs. Please follow the format of the example dataset (data(rnaedit_df)).

output

Type of output data, can be "GRanges" or "dataframe". Defaults to "GRanges".

rDropThresh_num

Threshold for minimum correlation between RNA editing levels of one site and the mean RNA editing levels of the rest of the sites. Please set a number between 0 and 1. Defaults to 0.4.

minPairCorr

Minimum pairwise correlation coefficient used to filter clusters for output. Only clusters in which every pairwise correlation between sites is greater than minPairCorr are selected. Set this argument to a number between -1 and 1 (defaults to 0.1); setting it to -1 turns the filter off.

minSites

Minimum number of sites required for a region. Only regions with more than minSites sites are returned. Defaults to 3.

method

Method for computing correlations, either "spearman" or "pearson". Defaults to "spearman".

minEditFreq

Minimum editing frequency for a given site, expressed as a proportion of samples between 0 and 1. Sites with a frequency lower than minEditFreq have their r_drop value set to NA. Defaults to 0.05.

returnAllSites

When no co-edited region is found in an input genomic region, returnAllSites = TRUE indicates outputting all the sites from the input region, while returnAllSites = FALSE indicates not returning any site from the input region. Defaults to FALSE.

verbose

Should messages and warnings be displayed? Defaults to TRUE, but is set to FALSE when called from within AllCoeditedRegions().
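
The quantities behind rDropThresh_num and minPairCorr can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy matrix, the helper names r_drop_sketch and min_pair_corr_sketch, and the choice of >= for the threshold comparison are assumptions made for illustration. Only the definitions given in the argument descriptions above are used.

```r
# Toy editing-level matrix (made-up values): rows are sites, columns are samples.
# The first three sites rise together across samples; the fourth does not.
shared <- seq(0.1, 0.9, by = 0.1)
toy_mat <- rbind(
  "chr1:100" = shared,
  "chr1:150" = shared^2,
  "chr1:220" = sqrt(shared),
  "chr1:300" = c(0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6)
)
colnames(toy_mat) <- paste0("sample", seq_len(ncol(toy_mat)))

# Hypothetical helper: correlation between each site and the mean editing
# level of all remaining sites (the r_drop definition given above).
r_drop_sketch <- function(mat, method = "spearman") {
  out <- vapply(seq_len(nrow(mat)), function(i) {
    rest_mean <- colMeans(mat[-i, , drop = FALSE])
    cor(mat[i, ], rest_mean, method = method)
  }, numeric(1))
  names(out) <- rownames(mat)
  out
}

# Hypothetical helper: smallest pairwise correlation among a set of sites.
min_pair_corr_sketch <- function(mat, method = "spearman") {
  cor_mat <- cor(t(mat), method = method)
  min(cor_mat[upper.tri(cor_mat)])
}

r_drop <- r_drop_sketch(toy_mat)

# Assumption: a site is retained when its r_drop reaches the threshold (0.4).
keep <- r_drop >= 0.4

# Minimum pairwise correlation among the retained sites, to compare
# against minPairCorr.
min_cor_kept <- min_pair_corr_sketch(toy_mat[keep, , drop = FALSE])
```

In this toy matrix the fourth site tracks the others only weakly, so it falls below the 0.4 threshold while the first three sites are retained.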

Value

When output is set to "GRanges", a GRanges object with seqnames, ranges and strand of the contiguous co-edited regions will be returned.

When output is set to "dataframe", a data frame with the following columns will be returned:

  • site : site ID.

  • chr : chromosome.

  • pos : genomic location.

  • r_drop : the correlation between RNA editing levels of one site and the mean RNA editing levels of the rest of the sites.

  • keep : indicator for co-edited sites. The sites with keep = 1 belong to the contiguous and co-edited region.

  • keep_contiguous : contiguous co-edited region number.

  • regionMinPairwiseCor : the minimum pairwise correlation of a co-edited region.

  • keep_regionMinPairwiseCor : equals 1 for contiguous co-edited sub-regions. The regions with keep_regionMinPairwiseCor = 1 are the ones that passed the regionMinPairwiseCor filter and will be returned as co-edited sub-regions.
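
As a rough sketch of how the data frame output might be consumed, the snippet below selects the sites flagged by keep_regionMinPairwiseCor and summarises each sub-region's coordinates. The data frame toy_out is made up to mimic the columns listed above and is not real function output. In particular, the values shown for the site that is not retained (0 and NA) are assumptions.

```r
# Made-up rows with the documented column names; not real output.
toy_out <- data.frame(
  site = c("chr1:100", "chr1:150", "chr1:220", "chr1:300"),
  chr = "chr1",
  pos = c(100, 150, 220, 300),
  r_drop = c(0.88, 0.88, 0.88, 0.17),
  keep = c(1, 1, 1, 0),
  keep_contiguous = c(1, 1, 1, 0),
  regionMinPairwiseCor = c(1, 1, 1, NA),
  keep_regionMinPairwiseCor = c(1, 1, 1, NA),
  stringsAsFactors = FALSE
)

# Keep only sites in sub-regions that passed the pairwise-correlation filter.
# which() drops rows where the indicator is NA.
kept <- toy_out[which(toy_out$keep_regionMinPairwiseCor == 1), ]

# One summary row per contiguous sub-region number.
region_summary <- do.call(rbind, lapply(
  split(kept, kept$keep_contiguous),
  function(d) {
    data.frame(
      seqnames = d$chr[1],
      start = min(d$pos),
      end = max(d$pos),
      nSites = nrow(d),
      stringsAsFactors = FALSE
    )
  }
))
```

The resulting region_summary has the same seqnames, start, and end columns that region_df expects.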

Examples

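  library(rnaEditr)

  # Load the example RNA editing matrix (sites in rows, samples in columns)
  data(rnaedit_df)

  # Input region: a one-row data frame with seqnames, start, and end
  exm_region <- data.frame(
    seqnames = "chr1",
    start = 28691093,
    end = 28826881,
    stringsAsFactors = FALSE
  )

  # Find contiguous co-edited sub-regions and return site-level results
  SingleCoeditedRegion(
    region_df = exm_region,
    rnaEditMatrix = rnaedit_df,
    minPairCorr = 0.25,
    output = "dataframe",
    method = "spearman"
  )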

TransBioInfoLab/rnaEditr documentation built on Nov. 29, 2022, 3:31 p.m.