predictCondition: Predict Condition
In Bishop-Laboratory/RLSeq: RLSeq: An analysis package for R-loop mapping data

Description

Uses the results of analyzeRLFS to predict whether a sample is "POS" (robust R-loop mapping) or "NEG" (poor R-loop mapping). See details.

Usage

predictCondition(object, rlfsRes = NULL, ...)

Arguments

`object`	An RLRanges object with analyzeRLFS already run. Ignored if `rlfsRes` is provided.
`rlfsRes`	If `object` is not supplied, provide the rlfsRes list, which is obtained from `rlresult(object, "rlfsRes")`.
`...`	Internal use only.

Details

Following R-loop forming sequences (RLFS) analysis, the quality model (see RLHub::models) is applied, together with other results from analyzeRLFS, to predict the sample condition. A prediction of “POS” indicates robust R-loop mapping, whereas “NEG” indicates poor R-loop mapping. The following sections describe this process in greater detail.

Application of binary classification model

First, the binary classifier is applied, yielding a preliminary prediction of quality. This is accomplished via the following steps:

Calculate the Fourier transform of the Z-score distribution (see analyzeRLFS).
Reduce the dimensions to the engineered feature set (see table below).
Apply the preprocessing model (see RLHub::models) to normalize these features.
Apply the classifier (see RLHub::models) to render a quality prediction.

Engineered feature set

Abbreviations: Z, Z-score distribution; ACF, autocorrelation function; FT, Fourier Transform.

feature	description
Z1	mean of Z
Z2	variance of Z
Zacf1	mean of Z ACF
Zacf2	variance of Z ACF
ReW1	mean of FT of Z (real part)
ReW2	variance of FT of Z (real part)
ImW1	mean of FT of Z (imaginary part)
ImW2	variance of FT of Z (imaginary part)
ReWacf1	mean of FT of Z ACF (real part)
ReWacf2	variance of FT of Z ACF (real part)
ImWacf1	mean of FT of Z ACF (imaginary part)
ImWacf2	variance of FT of Z ACF (imaginary part)
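
As a rough illustration of how features like those in the table could be derived from a Z-score distribution, here is a minimal base R sketch. The synthetic vector `z`, the `shift` grid, and the `stats::acf` settings are assumptions made for illustration; they are not taken from the RLSeq implementation, which may compute the ACF or the transform differently.

```r
# Sketch only: derive the 12 engineered features from a Z-score vector.
# `shift` and `z` are synthetic stand-ins, not output from analyzeRLFS.
set.seed(1)
shift <- seq(-5000, 5000, by = 100)  # hypothetical shift positions (bp)
z <- 10 * exp(-(shift / 1500)^2) + rnorm(length(shift), sd = 0.5)

# Autocorrelation function of Z (assumed settings: all lags, correlation type)
z_acf <- as.numeric(stats::acf(z, lag.max = length(z) - 1, plot = FALSE)$acf)

# Fourier transforms of Z and of its ACF (complex vectors)
ft_z <- stats::fft(z)
ft_acf <- stats::fft(z_acf)

# Mean and variance of each signal, split into real and imaginary parts
features <- c(
  Z1 = mean(z),                Z2 = var(z),
  Zacf1 = mean(z_acf),         Zacf2 = var(z_acf),
  ReW1 = mean(Re(ft_z)),       ReW2 = var(Re(ft_z)),
  ImW1 = mean(Im(ft_z)),       ImW2 = var(Im(ft_z)),
  ReWacf1 = mean(Re(ft_acf)),  ReWacf2 = var(Re(ft_acf)),
  ImWacf1 = mean(Im(ft_acf)),  ImWacf2 = var(Im(ft_acf))
)

print(round(features, 3))
```

The result is a named numeric vector with one entry per row of the table. In predictCondition, the corresponding values are then normalized by the preprocessing model before classification.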

Final quality prediction

The results from the binary classifier are combined with other results from analyzeRLFS to yield a final prediction. For a prediction of “POS”, all of the following must be TRUE:

The RLFS permutation test P value is significant (p < 0.05). Stored as PVal Significant in the results object.
The Z-score distribution at 0 bp is > 0. Stored as ZApex > 0 in the results object.
The Z-score distribution at 0 bp is > the start and the end. Stored as ZApex > ZEdges in the results object.
The binary classifier predicts a label of “POS”. Stored as Predicted 'POS' in the results object.
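
The way the four criteria combine can be sketched in base R as follows. All input values here (`pval`, `shift`, `z`, `classifier_label`) are hypothetical placeholders; in practice they would come from analyzeRLFS and the binary classifier, and the package's internal code may differ in detail.

```r
# Sketch only: combine the four criteria into a final "POS"/"NEG" call.
# Every input below is a made-up placeholder value.
pval <- 0.002                        # permutation test p-value
shift <- seq(-5000, 5000, by = 100)  # hypothetical shift positions (bp)
z <- 10 * exp(-(shift / 1500)^2)     # synthetic Z-score distribution
classifier_label <- "POS"            # assumed output of the binary classifier

z_apex <- z[shift == 0]              # Z-score at 0 bp
z_edges <- c(z[1], z[length(z)])     # Z-scores at the start and the end

criteria <- list(
  "PVal Significant" = pval < 0.05,
  "ZApex > 0"        = z_apex > 0,
  "ZApex > ZEdges"   = all(z_apex > z_edges),
  "Predicted 'POS'"  = classifier_label == "POS"
)

# "POS" only if every criterion is TRUE; otherwise "NEG"
prediction <- if (all(unlist(criteria))) "POS" else "NEG"
print(prediction)
```

Because the criteria are combined with a logical AND, a single failed criterion is enough to yield “NEG”, even when the classifier itself predicts “POS”.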

Value

An RLRanges object with predictions accessible via rlresult(object, "predictRes").

Structure

The results object is a named list of the structure:

Features
- A tbl with three columns that describe the engineered features used for prediction:
  - feature: The name of the feature (see details).
  - raw_value: The raw value of that feature in the supplied object.
  - processed_value: The normalized value of that feature after preprocessing (see details).
Criteria
- The four criteria which must all be TRUE to render a prediction of "POS" (see details).
prediction
- The final prediction. "POS" indicates robust R-loop mapping, "NEG" indicates poor R-loop mapping.
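
To show how a list of this shape might be inspected, the sketch below builds a mock object by hand. The values are invented, only two of the features are included, and a base `data.frame` stands in for the tbl; with a real RLRanges object the list would instead be retrieved with rlresult(object, "predictRes").

```r
# Sketch only: a hand-built mock of the documented results structure.
# All values are invented; Features is truncated to two rows for brevity.
pred_res <- list(
  Features = data.frame(
    feature = c("Z1", "Z2"),
    raw_value = c(4.2, 9.8),
    processed_value = c(0.31, -0.12)
  ),
  Criteria = list(
    "PVal Significant" = FALSE,
    "ZApex > 0"        = TRUE,
    "ZApex > ZEdges"   = TRUE,
    "Predicted 'POS'"  = TRUE
  ),
  prediction = "NEG"
)

# Final call
print(pred_res$prediction)

# Names of the criteria that were not met, useful for diagnosing a "NEG" call
failed <- names(pred_res$Criteria)[!unlist(pred_res$Criteria)]
print(failed)
```

Listing the failed criteria is a quick way to see why a sample was labeled “NEG”.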

Examples


# Load RLSeq, which provides predictCondition() and rlresult()
library(RLSeq)

# Example data with analyzeRLFS already run
rlr <- readRDS(system.file("extdata", "rlrsmall.rds", package = "RLSeq"))

# predict condition
rlr <- predictCondition(rlr)

# With rlfsRes
predRes <- predictCondition(rlfsRes = rlresult(rlr, "rlfsRes"))

Bishop-Laboratory/RLSeq documentation built on Jan. 28, 2023, 11:38 p.m.