predictDIMPclass: Predict DIMP class

predictDIMPclass R Documentation

Predict DIMP class

Description

This function classifies each DMP as a control or a treatment DMP.

Usage

predictDIMPclass(
  LR,
  model,
  conf.matrix = FALSE,
  control.names = NULL,
  treatment.names = NULL
)

Arguments

LR

A list of GRanges objects obtained through the MethylIT downstream analysis. Basically, this object is a list of GRanges containing only differentially methylated positions (DMPs). The metacolumns of each GRanges must include the Hellinger divergence 'hdiv', the total variation 'TV', and the probability of potential DMP 'wprob', all of which are added by the MethylIT downstream analysis.

model

A classifier model obtained with the function 'evaluateDIMPclass'.

conf.matrix

Optional. Logical, whether a confusion matrix should be returned (default, FALSE, see below).

control.names

Optional. Names/IDs of the control samples, which must be included in the variable LR (default, NULL).

treatment.names

Optional. Names/IDs of the treatment samples, which must be included in the variable LR (default, NULL).
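The metacolumn requirement on LR can be checked before calling the function. The following is a minimal sketch, not part of MethylIT: the helper name missing_dmp_columns and the toy GRanges objects are hypothetical, and it assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Hypothetical helper (not a MethylIT function): for each GRanges in a list,
# report which of the required metacolumns are missing.
missing_dmp_columns <- function(LR, required = c("hdiv", "TV", "wprob")) {
    lapply(LR, function(gr) setdiff(required, colnames(mcols(gr))))
}

# Toy input: one complete sample and one lacking 'wprob'
gr_ok <- GRanges("chr1", IRanges(start = c(10, 20), width = 1),
                 hdiv = c(2.1, 5.3), TV = c(0.4, 0.7), wprob = c(0.98, 0.99))
gr_bad <- GRanges("chr1", IRanges(start = 30, width = 1),
                  hdiv = 1.7, TV = 0.3)

res <- missing_dmp_columns(list(C1 = gr_ok, T1 = gr_bad))
res
```

An empty character vector for a sample means that all three columns are present.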

Details

Predictions only make sense if the query DMPs belong to the same methylation context and derive from an experiment carried out under the same set of conditions as the DMPs used to build the model.

Value

The same LR object with two new columns, named 'class' and 'posterior', added to each GRanges object from LR (default). Based on the model prediction, each DMP is labeled as control 'CT' or as treatment 'TT' in column 'class'. Column 'posterior' provides, for each DMP, the posterior probability that the given DMP can be classified as induced by the 'treatment' (a treatment DMP).

Control DMPs classified as 'treatment' are false positives. However, if the same cytosine position is classified as 'treatment DMP' in both groups, control and treatment, but with higher posterior probability in the treatment group, then this would indicate a reinforcement of the methylation status in such a position induced by the treatment.
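The reinforcement check described above can be sketched in base R. The data frames below are toy values invented for illustration; in practice the 'class' and 'posterior' values would come from the GRanges metacolumns of a control and a treatment sample, and the column name pos is an assumption.

```r
# Toy per-position predictions (invented values, for illustration only)
control <- data.frame(pos = c(101, 250, 377),
                      class = c("TT", "CT", "TT"),
                      posterior = c(0.62, 0.20, 0.91))
treatment <- data.frame(pos = c(101, 377, 512),
                        class = c("TT", "TT", "TT"),
                        posterior = c(0.97, 0.85, 0.99))

# Keep only the positions present in both groups
both <- merge(control, treatment, by = "pos",
              suffixes = c(".ctrl", ".treat"))

# Classified as treatment DMP in both groups, with a higher posterior
# probability in the treatment group
reinforced <- both[both$class.ctrl == "TT" & both$class.treat == "TT" &
                   both$posterior.treat > both$posterior.ctrl, ]
reinforced$pos
```

In this toy data only position 101 qualifies: position 377 is classified as 'TT' in both groups, but its posterior probability is higher in the control.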

If 'conf.matrix' is TRUE and the arguments control.names and treatment.names are provided, then the overall confusion matrix is returned.
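To illustrate what a confusion matrix summarizes here, the following base R sketch cross-tabulates predicted classes against the group each DMP came from. The vectors are toy values, and the layout of the object actually returned by predictDIMPclass may differ from this plain table.

```r
# Toy data: the true group of each DMP (taken from its sample of origin)
# and the class assigned by a classifier
actual    <- factor(c("CT", "CT", "CT", "TT", "TT", "TT", "TT"),
                    levels = c("CT", "TT"))
predicted <- factor(c("CT", "TT", "CT", "TT", "TT", "CT", "TT"),
                    levels = c("CT", "TT"))

# Rows: predicted class; columns: actual group
conf <- table(Predicted = predicted, Actual = actual)
conf

# Control DMPs predicted as treatment are the false positives
false_pos <- conf["TT", "CT"]

# Overall accuracy: correctly classified DMPs over all DMPs
accuracy <- sum(diag(conf)) / sum(conf)
```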

Examples


### Attach the package (its lapply method accepts 'keep.attr')
### and load the example datasets
library(MethylIT)
data(logit_perf, dmps, package = 'MethylIT')
set.seed(123)

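### Select a random subset (70%) from each DMP sample
DMPs <- lapply(dmps, function(x) {
            ## Draw 70% of the positions without replacement,
            ## keeping them in their original order
            n <- length(x)
            idx <- sort(sample.int(n, size = floor(n * 0.7)))
            return(x[ idx ])
    }, keep.attr = TRUE)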


### Prediction with the logistic model
predclass.dmps <- predictDIMPclass(
    LR = DMPs,
    model = logit_perf$model,
    conf.matrix = TRUE,
    control.names =  c('C1', 'C2', 'C3'),
    treatment.names = c('T1', 'T2', 'T3'))

predclass.dmps

### Prediction with the PCA-QDA model
### (assumes 'pcaQda_perf' stores its classifier in '$model', as 'logit_perf' does)
data("pcaQda_perf", package = 'MethylIT')

predclass.dmps <- predictDIMPclass(
    LR = DMPs,
    model = pcaQda_perf$model,
    conf.matrix = TRUE,
    control.names =  c('C1', 'C2', 'C3'),
    treatment.names = c('T1', 'T2', 'T3'))

predclass.dmps

### Prediction with the PCA-LDA model
### (assumes 'pcaLda_perf' stores its classifier in '$model', as 'logit_perf' does)
data("pcaLda_perf", package = 'MethylIT')

predclass.dmps <- predictDIMPclass(
    LR = DMPs,
    model = pcaLda_perf$model,
    conf.matrix = TRUE,
    control.names =  c('C1', 'C2', 'C3'),
    treatment.names = c('T1', 'T2', 'T3'))

predclass.dmps

genomaths/MethylIT documentation built on Feb. 3, 2024, 1:24 a.m.