calcNormFactors: Calculate normalization factors


Description

This function calculates normalization factors for a TCC-class object using a specified multi-step normalization method. The procedure can generally be described as the STEP1-(STEP2-STEP3)n pipeline, where n is the number of iterations.

Usage

## S4 method for signature 'TCC'
calcNormFactors(tcc, norm.method = NULL, test.method = NULL,
                iteration = TRUE, FDR = NULL, floorPDEG = NULL,
                increment = FALSE, ...)

Arguments

tcc

TCC-class object.

norm.method

character specifying the normalization method used in both STEP1 and STEP3. Possible values are "tmm" for the TMM normalization method implemented in the edgeR package, "edger" (same as "tmm"), and "deseq2" for the method implemented in the DESeq2 package. The default is "tmm".

test.method

character specifying the method for identifying differentially expressed genes (DEGs) used in STEP2: one of "edger", "deseq2", "bayseq", "voom" and "wad". See the "Details" field in estimateDE for details. The default is "edger".

iteration

logical or numeric value specifying the number of iterations (n) in the proposed normalization pipeline: the STEP1-(STEP2-STEP3)n pipeline. If FALSE or 0 is specified, only the STEP1 normalization is performed. If TRUE or 1 is specified, the three-step normalization pipeline is performed once. Integers higher than 1 specify how many times the STEP2-STEP3 part of the pipeline is repeated.

FDR

numeric value (between 0 and 1) specifying the threshold for determining potential DEGs after STEP2.

floorPDEG

numeric value (between 0 and 1) specifying the minimum proportion of genes to be eliminated as potential DEGs before performing STEP3.

increment

logical value. If increment = TRUE, the DEGES pipeline is run again, starting from the current iterated result.

...

arguments to identify potential DEGs at STEP2. See the "Arguments" field in estimateDE for details.

Details

The calcNormFactors function is the main function in the TCC package. Because the pipeline applies a DEG identification method at STEP2, the multi-step strategy can eliminate the negative effect of potential DEGs before the second normalization at STEP3. To fully utilize the DEG elimination strategy (DEGES), we strongly recommend against using iteration = 0 or iteration = FALSE. This function internally calls functions implemented in other R packages according to the specified argument values.
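
The control flow of the STEP1-(STEP2-STEP3)n pipeline can be sketched in base R as follows. This is only a toy illustration: toy_norm_factors, toy_flag_degs, toy_deges and the pdeg argument are hypothetical names, and the stand-in steps (count-share scaling and fold-change ranking) are deliberately simple substitutes for the TMM, edgeR and DESeq2 methods that calcNormFactors actually calls.

```r
# Toy DEGES-style loop in base R. Every function here is a hypothetical
# stand-in for illustration, not the TCC, edgeR, or DESeq2 implementation.

# STEP1 / STEP3 stand-in: scale each sample by the share of its counts that
# falls on the genes in `keep`, then rescale so the factors average to 1.
toy_norm_factors <- function(counts, keep = rep(TRUE, nrow(counts))) {
  f <- colSums(counts[keep, , drop = FALSE]) / colSums(counts)
  f / mean(f)
}

# STEP2 stand-in: flag the `pdeg` fraction of genes with the largest absolute
# log fold change between the two groups after normalization.
toy_flag_degs <- function(counts, group, factors, pdeg) {
  eff_lib <- colSums(counts) * factors          # effective library sizes
  norm <- sweep(counts, 2, eff_lib, "/")
  m1 <- rowMeans(norm[, group == 1, drop = FALSE])
  m2 <- rowMeans(norm[, group == 2, drop = FALSE])
  lfc <- abs(log((m1 + 1e-8) / (m2 + 1e-8)))
  rank(-lfc, ties.method = "first") <= round(nrow(counts) * pdeg)
}

toy_deges <- function(counts, group, iteration = 1, pdeg = 0.1) {
  n <- as.integer(iteration)                    # TRUE -> 1, FALSE -> 0
  factors <- toy_norm_factors(counts)           # STEP1: use all genes
  for (i in seq_len(n)) {
    is_deg <- toy_flag_degs(counts, group, factors, pdeg)  # STEP2
    factors <- toy_norm_factors(counts, keep = !is_deg)    # STEP3: non-DEGs only
  }
  factors
}

# Simulated counts (not real data): genes 1-20 are strongly up-regulated
# in group 1.
set.seed(1)
group <- c(1, 1, 1, 2, 2, 2)
counts <- matrix(rpois(200 * 6, lambda = 50), nrow = 200)
counts[1:20, 1:3] <- rpois(20 * 3, lambda = 1000)

step1_only <- toy_deges(counts, group, iteration = FALSE)
deges      <- toy_deges(counts, group, iteration = 1)
ideges     <- toy_deges(counts, group, iteration = 3)
```

With iteration = FALSE the loop body never runs, so only the STEP1 result is returned; this is the case the recommendation above advises against, because the potential DEGs are never removed before normalization.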

Value

After performing the calcNormFactors function, the calculated normalization factors are populated in the norm.factors field (i.e., tcc$norm.factors). Parameters used for DEGES normalization (e.g., potential DEGs identified in STEP2, execution times for the identification, etc.) are stored in the DEGES field (i.e., tcc$DEGES) as follows:

iteration

the iteration number n for the STEP1-(STEP2-STEP3)n pipeline.

pipeline

the DEGES normalization pipeline.

threshold

stores (i) the type of threshold (threshold$type), (ii) the threshold value (threshold$input), and (iii) the percentage of potential DEGs actually used (threshold$PDEG). These values depend on whether the percentage of DEGs identified in STEP2 is higher or lower than the value given by floorPDEG. Consider, for example, running the calcNormFactors function with FDR = 0.1 and floorPDEG = 0.05. If the percentage of DEGs identified in STEP2 satisfying FDR = 0.1 was 0.14 (i.e., higher than the floorPDEG of 0.05), the values in the threshold fields will be threshold$type = "FDR", threshold$input = 0.1, and threshold$PDEG = 0.14. If the percentage was 0.03 (i.e., lower than the floorPDEG of 0.05), the values in the threshold fields will be threshold$type = "floorPDEG", threshold$input = 0.05, and threshold$PDEG = 0.05.

potDEG

numeric binary vector (0 for non-DEG or 1 for DEG) after the evaluation of the percentage of DEGs identified in STEP2 against the predefined floorPDEG value. If the percentage (e.g., 2%) is lower than the floorPDEG value (e.g., 17%), 17% of the elements are set to 1 (DEG).

prePotDEG

numeric binary vector (0 for non-DEG or 1 for DEG) before the evaluation of the percentage of DEGs identified in STEP2 with the predefined floorPDEG value. Regardless of the floorPDEG value, the percentage of elements with 1 is always the same as that of DEGs identified in STEP2.

execution.time

computation time required for normalization.
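
The relationship between the threshold, potDEG and prePotDEG fields described above can be sketched in base R. The helper toy_threshold is hypothetical and is not TCC code; in particular, filling the floor by ranking p-values and treating a percentage exactly equal to floorPDEG as the "FDR" case are assumptions made for this illustration.

```r
# Hypothetical helper illustrating how the threshold fields relate to FDR and
# floorPDEG. A sketch of the documented behavior, not the TCC implementation.
toy_threshold <- function(p_value, q_value, FDR, floorPDEG) {
  # Flags before the floorPDEG check: genes passing the FDR threshold.
  pre_pot_deg <- as.numeric(q_value < FDR)
  pdeg <- mean(pre_pot_deg)
  if (pdeg >= floorPDEG) {
    # Enough DEGs were found: keep the FDR-based flags.
    list(prePotDEG = pre_pot_deg, potDEG = pre_pot_deg,
         threshold = list(type = "FDR", input = FDR, PDEG = pdeg))
  } else {
    # Too few DEGs: flag the top floorPDEG fraction of genes instead
    # (assumption: ranked by p-value).
    n_floor <- round(length(p_value) * floorPDEG)
    pot_deg <- as.numeric(rank(p_value, ties.method = "first") <= n_floor)
    list(prePotDEG = pre_pot_deg, potDEG = pot_deg,
         threshold = list(type = "floorPDEG", input = floorPDEG,
                          PDEG = floorPDEG))
  }
}

# Made-up p-values for 100 genes: 14 pass FDR = 0.1 in the first case,
# only 3 in the second.
p_high <- c(seq(0.0001, 0.0014, length.out = 14), seq(0.2, 0.9, length.out = 86))
p_low  <- c(0.0001, 0.0002, 0.0003, seq(0.2, 0.9, length.out = 97))

res_high <- toy_threshold(p_high, p.adjust(p_high, "BH"),
                          FDR = 0.1, floorPDEG = 0.05)
res_low  <- toy_threshold(p_low, p.adjust(p_low, "BH"),
                          FDR = 0.1, floorPDEG = 0.05)

res_high$threshold$type   # "FDR"
res_low$threshold$type    # "floorPDEG"
```

In the second case prePotDEG still marks only the three genes that passed the FDR threshold, while potDEG marks five genes, which is the floorPDEG fraction of the 100 genes.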

Examples

library(TCC)
data(hypoData)
group <- c(1, 1, 1, 2, 2, 2)

# Calculating normalization factors using the DEGES/edgeR method 
# (the TMM-edgeR-TMM pipeline).
tcc <- new("TCC", hypoData, group)
tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                       iteration = 1, FDR = 0.1, floorPDEG = 0.05)
tcc$norm.factors

# Calculating normalization factors using the iterative DEGES/edgeR method 
# (iDEGES/edgeR) with n = 3.
tcc <- new("TCC", hypoData, group)
tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                       iteration = 3, FDR = 0.1, floorPDEG = 0.05)
tcc$norm.factors

TCC documentation built on Nov. 8, 2020, 8:20 p.m.