mixedcirc_remove_trend: Removes linear trend from the data

View source: R/mixedcirc_remove_trend.R

mixedcirc_remove_trend    R Documentation

Removes linear trend from the data

Description

This function performs trend removal using mixed models.

Usage

mixedcirc_remove_trend(
  data_input = NULL,
  time = NULL,
  group = NULL,
  id = NULL,
  lm_method = c("lm", "lme")[2],
  obs_weights = NULL,
  RRBS = FALSE,
  replicate_id = NULL,
  remove_trend_separate_groups = TRUE,
  force_weight_estimation = FALSE,
  ncores = 1,
  verbose = FALSE,
  ...
)

Arguments

data_input

A numeric matrix or data.frame (N*P) or DGEList in which the rows are samples (N) and the columns are variables (P). If a DGEList is provided, regression weights will be estimated.

time

A vector of length N giving the circadian time of each sample.

group

A character vector of length N. For differential circadian rhythm analysis, group is a factor giving the group of each sample. Analysis of two groups is supported at this stage. See Details.

id

A vector of length N giving the identity of each *unique* sample. See Details.

lm_method

The regression method to use. At this stage, 'lm' and 'lme' are supported. If 'lm' is selected, ordinary regression is performed. Default: "lme"

obs_weights

Regression weights. Default: NULL. See Details.

RRBS

If 'TRUE', the data is assumed to be RRBS methylation data, and 'obs_weights' must be set. Default: FALSE

replicate_id

If 'RRBS' is set to 'TRUE', this must be a factor giving the identity of each unique replicate.

remove_trend_separate_groups

If TRUE, the detrending is performed separately on each group (default:TRUE)

force_weight_estimation

If TRUE, variance-mean trend weight estimation will be performed regardless of the data input type (default: FALSE)

ncores

Number of cores to use. Default: 1

verbose

Show information about different stages of the processes. Default FALSE

...

Additional arguments passed to the regression function.

Details

For each variable, the following model is fitted:

measure ~ time

The residuals of this model are returned.
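As a rough illustration of what removing a linear trend means, the sketch below fits `measure ~ time` to each column of a simulated matrix with base R's `lm()` and keeps the residuals. It is not the package's internal code: the simulated `data_matrix`, the variable names, and the use of plain `lm()` without groups, weights, or random effects are all assumptions made for the example.

```r
# Illustrative sketch only: per-variable linear detrending with base R.
# All objects below are simulated; none come from the mixedcirc package.
set.seed(1)
time <- rep(seq(0, 44, by = 4), each = 2)   # 12 time points, 2 samples each
n <- length(time)

# Two variables with a linear drift (var1 also has a 24 h oscillation)
data_matrix <- cbind(
  var1 = 0.05 * time + sin(2 * pi * time / 24) + rnorm(n, sd = 0.1),
  var2 = -0.02 * time + rnorm(n, sd = 0.1)
)

# Fit measure ~ time for each column and keep the residuals
detrended <- apply(data_matrix, 2, function(y) residuals(lm(y ~ time)))

dim(detrended)   # same N x P shape as the input
```

Because the model contains an intercept and a slope, the residuals of each column have zero mean and no remaining linear association with `time`, while the oscillating component in `var1` is largely preserved.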

'obs_weights' is a matrix of size N*P in which each column holds the weights of all observations for that particular variable.
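The shape requirement can be checked before calling the function. The sketch below builds a weight matrix with the same dimensions as a simulated data matrix and uses one of its columns in a weighted base-R `lm()` fit. The data, the names `s1`..`s12` and `v1`..`v3`, and the choice of equal weights are assumptions made for illustration; real weights would typically come from a variance-mean trend estimate.

```r
# Illustrative sketch only: an N x P weight matrix matching the data matrix.
set.seed(2)
n <- 12
p <- 3
time <- seq(0, 44, by = 4)

data_matrix <- matrix(
  rnorm(n * p), nrow = n, ncol = p,
  dimnames = list(paste0("s", 1:n), paste0("v", 1:p))
)

# One weight per observation and variable; equal weights used here as a placeholder
obs_weights <- matrix(1, nrow = n, ncol = p, dimnames = dimnames(data_matrix))
stopifnot(identical(dim(obs_weights), dim(data_matrix)))

# Column j of obs_weights supplies the weights for variable j
fit_weighted   <- lm(data_matrix[, "v1"] ~ time, weights = obs_weights[, "v1"])
fit_unweighted <- lm(data_matrix[, "v1"] ~ time)
```

With all weights equal to 1, the weighted and unweighted fits give the same coefficients, which makes this a convenient sanity check before substituting estimated weights.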

If 'RRBS' is set to 'TRUE', we assume that the data is RRBS methylation data. In this case, the regression is changed to suit this type of analysis. In BS-seq methylation analysis, each DNA sample generates two counts for each genomic locus: a count of methylated reads and a count of unmethylated reads.

The samples are assumed to be ordered as methylated and then unmethylated. For example, given the samples A1_1, A1_2, A2_1, and B1_1, the rows in 'data_input' are assumed to be ordered as A1_1_methylated, A1_1_unmethylated, A1_2_methylated, A1_2_unmethylated, A2_1_methylated, A2_1_unmethylated, B1_1_methylated, B1_1_unmethylated. In this setting, 'replicate_id' must give the unique identity of each replicate (in contrast to each unique biological sample). For the example above, 'replicate_id' would be A1_1, A1_2, A2_1, B1_1.

Please note that if 'RRBS' is set to 'TRUE', 'obs_weights' must be set. We provide a starting function to estimate these weights (mixedcirc_rrbs_voom) for objects of class 'methylBaseDB' from the 'methylKit' package. Alternatively, one can use 'voomWithDreamWeights' with the correct formula.
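The expected row ordering for RRBS input can be generated from the replicate labels. The sketch below reproduces the ordering from the example in the text using base R only. The variable names are hypothetical, and deriving the biological sample label by stripping the trailing replicate number is an assumption based on the naming pattern A1_1, A1_2.

```r
# Illustrative sketch only: expected row order for RRBS data_input.
replicates <- c("A1_1", "A1_2", "A2_1", "B1_1")

# Each replicate contributes a methylated row followed by an unmethylated row
expected_rows <- paste(
  rep(replicates, each = 2),
  c("methylated", "unmethylated"),
  sep = "_"
)
expected_rows

# Assumed naming convention: biological sample = label without the replicate suffix
biological_sample <- sub("_[0-9]+$", "", replicates)
biological_sample
```

Comparing `expected_rows` against `rownames(data_input)` before running the analysis is one way to catch a mis-ordered matrix early.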

Value

A data.frame

Examples

library(mixedcirc)
data("circa_data")

# Remove the linear trend from every variable in the example data set
results <- mixedcirc_remove_trend(
  data_input = circa_data$data_matrix, time = circa_data$time,
  group = circa_data$group, id = circa_data$id, verbose = TRUE
)


PayamEmami/mixedcirc documentation built on Jan. 15, 2025, 5:36 p.m.