correlated_regions: Correlation between methylation and expression


View source: R/correlated_regions.R

Description

Correlation between methylation and expression

Usage

correlated_regions(sample_id, expr, txdb, chr, extend = 50000,
    cov_filter = function(x) sum(x > 0, na.rm = TRUE) > length(x)/2,
    cor_method = "spearman", subgroup = NULL, window_size = 5, window_step = window_size,
    max_width = 10000, raw_meth = FALSE, cov_cutoff = 3, min_dp = 4, col = NULL, genome = "hg19")

Arguments

sample_id

a vector of sample IDs

expr

expression matrix

txdb

a TxDb-class object.

chr

a single chromosome name

extend

extension of gene model, both upstream and downstream

cov_filter

if the coverage hook is set in methylation_hooks, this option can be used to filter out CpG sites with low coverage across samples. The value is a function whose argument is a vector of coverage values for the current CpG site in all samples. The default setting requires the CpG site to have non-zero coverage in more than half of the samples.
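As a minimal sketch of how such a filter behaves, the block below applies the default function from the Usage section to two made-up coverage vectors. The vectors and the stricter `strict_filter` are hypothetical and only illustrate the expected function shape (a coverage vector in, a single logical out).

```r
# Default filter from the Usage section: keep a CpG site when more than
# half of the samples have non-zero coverage.
cov_filter <- function(x) sum(x > 0, na.rm = TRUE) > length(x)/2

# Hypothetical coverage values for one CpG site across six samples.
well_covered   <- c(10, 0, 7, 3, 12, 5)   # 5 of 6 samples have coverage
poorly_covered <- c(0, 0, NA, 4, 0, 2)    # 2 of 6 samples have coverage

cov_filter(well_covered)     # TRUE
cov_filter(poorly_covered)   # FALSE

# Hypothetical stricter alternative: coverage >= 5 in at least 80% of samples.
strict_filter <- function(x) sum(x >= 5, na.rm = TRUE) >= 0.8 * length(x)
strict_filter(well_covered)  # FALSE, only 4 of 6 samples reach coverage 5
```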

cor_method

method for calculating correlations, passed to cor.

subgroup

subgroup information. If provided, an ANOVA test is applied and group means are calculated for each correlated region.

window_size

number of CpG sites in a window

window_step

step of the sliding window, measured in number of CpG sites

max_width

maximum width of a window
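The three window arguments can be pictured with the sketch below, which enumerates windows over a vector of CpG positions. `sliding_cpg_windows` is a hypothetical helper, not an epik function, and it assumes that incomplete trailing windows and windows wider than max_width are simply dropped; the package may handle these cases differently.

```r
# Hypothetical helper: windows are defined by CpG count, not by base pairs.
sliding_cpg_windows <- function(pos, window_size = 5,
                                window_step = window_size,
                                max_width = 10000) {
    n <- length(pos)
    if (n < window_size) {
        return(data.frame(start_idx = integer(0), end_idx = integer(0),
                          width = numeric(0)))
    }
    # Index of the first CpG site in each window.
    start_idx <- seq(1, n - window_size + 1, by = window_step)
    end_idx <- start_idx + window_size - 1
    # Genomic width spanned by the CpG sites in each window.
    width <- pos[end_idx] - pos[start_idx] + 1
    keep <- width <= max_width
    data.frame(start_idx = start_idx[keep], end_idx = end_idx[keep],
               width = width[keep])
}

# Made-up CpG positions with a large gap after the seventh site.
pos <- c(100, 150, 220, 300, 410, 520, 640,
         20000, 20100, 20150, 20300, 20400)

w_default <- sliding_cpg_windows(pos)                  # non-overlapping windows
w_overlap <- sliding_cpg_windows(pos, window_step = 2) # overlapping windows
```

With the default step the windows do not overlap; a smaller window_step gives overlapping windows, and any window that spans the large gap exceeds max_width and is discarded.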

raw_meth

whether to use raw methylation values (values from the raw hook set in methylation_hooks)

cov_cutoff

cutoff for CpG coverage, used only with raw methylation. When the CpG coverage is too low, the raw methylation rate is not reliable. The raw methylation rate for CpG sites with coverage less than this cutoff is set to NA, and the windows are then further filtered by min_dp.

min_dp

minimal number of non-NA values for calculating correlations. When meth is the raw methylation, values for which CpG coverage is too low are replaced with NA, and only non-NA values are used to calculate correlations. If the number of data points for calculating the correlation is less than min_dp, the CpG window is excluded.
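The interplay of cov_cutoff and min_dp can be sketched in base R as follows. The per-sample values are invented, and treating one methylation value and one coverage value per sample is a simplification made for this sketch rather than a description of the package internals.

```r
# Invented per-sample values for one CpG window (six samples).
raw_meth <- c(0.9, 0.8, 0.2, 0.4, 0.6, 0.1)
coverage <- c(10, 2, 8, 1, 5, 3)
expr     <- c(1.2, 3.0, 6.5, 4.1, 2.2, 7.0)

cov_cutoff <- 3
min_dp <- 4

# Values with coverage below the cutoff are considered unreliable.
meth <- raw_meth
meth[coverage < cov_cutoff] <- NA

# Number of samples that still contribute a data point.
n_dp <- sum(!is.na(meth))

# Correlate only when enough data points remain; otherwise drop the window.
corr <- if (n_dp >= min_dp) {
    cor(meth, expr, method = "spearman", use = "complete.obs")
} else {
    NA_real_
}
```

Here two samples fall below the coverage cutoff, four data points remain, and the window is kept because four is not less than min_dp.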

col

color for subgroups. This setting will be saved in the returned object and will be used in downstream analysis. If not set, random colors are assigned.

genome

genome. This setting will be saved and used in downstream analysis.

Details

A correlated region is defined as a region where methylation is correlated with the expression of the associated gene. The detection of correlated regions is gene-centric. For every gene, the processes are as follows:

The following meta columns are attached to the GRanges object:

ncpg

number of CpG sites

mean_meth_*

mean methylation in each window in every sample.

corr

correlation between methylation and expression

corr_p

p-value for the correlation test

meth_IQR

IQR of mean methylation if subgroup is not set

meth_anova

p-value from the one-way ANOVA test if subgroup is set

meth_diameter

difference between the maximum and the minimum subgroup mean if subgroup is set

meth_diff

when there are two subgroups, the mean methylation in subgroup 1 minus the mean methylation in subgroup 2.

gene_id

gene id

gene_tss_dist

distance to the TSS of the gene

tx_tss_dist

if a gene has multiple transcripts, this is the distance to the TSS of the nearest transcript

nearest_txx_tss

transcript id of the nearest transcript
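The subgroup-related columns can be illustrated on toy data with base R. The numbers are invented, and the use of `oneway.test` is an assumption made for this sketch; the package may compute the ANOVA p-value differently.

```r
# Invented mean methylation of one window in six samples, two subgroups.
mean_meth <- c(s1 = 0.80, s2 = 0.75, s3 = 0.85,
               s4 = 0.30, s5 = 0.25, s6 = 0.35)
subgroup <- factor(c("A", "A", "A", "B", "B", "B"))

# Without subgroups: spread of mean methylation across samples.
meth_IQR <- IQR(mean_meth)

# With subgroups: group means, their range, and a one-way ANOVA p-value.
group_mean <- tapply(mean_meth, subgroup, mean)
meth_diameter <- max(group_mean) - min(group_mean)
meth_anova <- oneway.test(mean_meth ~ subgroup)$p.value

# With exactly two subgroups: subgroup 1 minus subgroup 2.
meth_diff <- unname(group_mean[1] - group_mean[2])
```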

This function keeps the information for all CpG windows. Users can use cr_add_fdr_column to add FDR columns to the object, filter significant correlated regions by the p-value, FDR and meth_diff columns, or use cr_reduce to reduce the significant regions.
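The filtering idea can be sketched on a plain data frame. This does not call cr_add_fdr_column, whose arguments are not described here; it uses `p.adjust` with the Benjamini-Hochberg method as a stand-in, and the column name `corr_fdr` as well as the cutoffs are hypothetical.

```r
# Invented statistics for four CpG windows.
cr <- data.frame(
    corr      = c(-0.9, 0.1, 0.7, -0.2),
    corr_p    = c(0.001, 0.6, 0.01, 0.4),
    meth_diff = c(0.4, 0.02, -0.3, 0.05)
)

# Stand-in for an FDR column: Benjamini-Hochberg adjusted p-values.
cr$corr_fdr <- p.adjust(cr$corr_p, method = "BH")

# Keep windows that pass both an FDR cutoff and a methylation difference cutoff.
sig <- cr[cr$corr_fdr < 0.05 & abs(cr$meth_diff) > 0.2, ]
```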

Value

A GRanges object which contains correlations and associated statistics for every CpG window.

The settings for finding correlated regions are stored as the meta data of the GRanges object.

Author(s)

Zuguang Gu <z.gu@dkfz.de>

See Also

Internally, the calculation is done by correlated_regions_by_window.

Examples

# There is no example
NULL

jokergoo/epik documentation built on Sept. 28, 2019, 9:20 a.m.