dPCA: Differential Principal Component Analysis

Description

Runs differential principal component analysis (dPCA) using the incorporated dPCA program.

Usage

dPCA(meta, bed, data, sampleId = NULL, groups = 1:2,
    datasets = NULL, transform = NULL, normlen = NULL,
    minlen = 50, lambda = 0.15,
    fun = function(x) sqrt(mean(x^2)),
    datasetLabels = NULL, groupLabels = NULL,
    qnormalize = TRUE, qnormalizeFirst = FALSE,
    normalize = FALSE, verbose = FALSE, interactive = FALSE,
    useSVD = FALSE, saveFile = FALSE, processedData = TRUE,
    removeLowCoverageChIPseq = FALSE,
    removeLowCoverageChIPseqProbs = 0.1, dPCsigns = NULL,
    nPaired = 0, nTransform = 0, nColMeanCent = 0,
    nColStand = 0, nMColMeanCent = 1, nMColStand = 0,
    dSNRCut = 5, nUsedPCAZ = 0, nUseRB = 0, dPeakFDRCut = 0.5)

Arguments

meta

Meta information consisting of three columns: (1) file names, (2) biological groups, and (3) dataset IDs. Can be either a data.frame or a matrix.

bed

Input genomic regions in data.frame or matrix format. It must have four columns in exactly this order: chromosome name, start position, end position, ID.

data

Data.frame or matrix containing the intensity data for each input genomic region. Columns are sorted in the order of the file names in the meta information. Can be produced using importBW.
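As an illustration of how the three inputs described above fit together, here is a minimal sketch with toy values. The column names, file names, region IDs and intensities are all hypothetical and chosen only for this example; only the column order and the row/column correspondence follow the argument descriptions.

```r
# Hypothetical toy inputs; file names, IDs and values are made up for illustration.
meta <- data.frame(
  file    = c("g1_d1.bw", "g2_d1.bw", "g1_d2.bw", "g2_d2.bw"),
  group   = c(1, 2, 1, 2),
  dataset = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Four columns in the required order: chromosome, start, end, ID.
bed <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(1000, 5000, 200),
  end   = c(1500, 5600, 900),
  id    = c("r1", "r2", "r3"),
  stringsAsFactors = FALSE
)

# One row per region in `bed`, one column per file in `meta`, in the same order.
data <- matrix(
  c(10, 12, 3, 4,
     8,  7, 9, 1,
     0,  2, 5, 6),
  nrow = nrow(bed), byrow = TRUE,
  dimnames = list(bed$id, meta$file)
)

# Basic consistency checks between the three inputs.
stopifnot(ncol(data) == nrow(meta), nrow(data) == nrow(bed))
```

In practice the intensity matrix would come from importBW rather than being typed by hand.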

sampleId

Vector of sample IDs to be tested, equivalent to the row (line) numbers of the meta information.

groups

Vector or list of group IDs to be tested, as specified in the group field of the meta information. If the input is a list, for example list(1:3,4), all IDs in the same vector of the list (here 1, 2 and 3) are treated as one group.
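The list form of groups can be pictured as a relabelling of the group column. The sketch below uses a hypothetical helper, merge_groups, which is not part of the package and only mirrors the behaviour described above; how dPCA treats group IDs that are not listed is an assumption here (they are marked NA).

```r
# Hypothetical helper (not part of the package): give every ID inside one
# list element the same label, numbered by the element's position.
merge_groups <- function(group_col, groups) {
  groups <- as.list(groups)
  out <- rep(NA_integer_, length(group_col))
  for (i in seq_along(groups)) {
    out[group_col %in% groups[[i]]] <- i
  }
  out
}

group_col <- c(1, 2, 3, 4, 4, 5)   # made-up group column from `meta`
merged <- merge_groups(group_col, list(1:3, 4))
print(merged)   # IDs 1-3 map to 1, ID 4 maps to 2, unlisted ID 5 stays NA
```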

datasets

Vector of dataset IDs to be tested, as specified in the dataset field of the meta information.

transform

Vector of dataset IDs that need to be transformed. Power transformations are applied according to the lambda estimated by the boxcox function. If lambda is less than or equal to 0, log transformations are applied instead.
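The rule described for transform can be sketched as follows. The helper power_or_log is hypothetical, and lambda is passed in by hand rather than estimated. Whether the package uses the plain power x^lambda or the scaled Box-Cox form is not stated here, so the plain power is an assumption.

```r
# Hypothetical helper: power-transform when lambda > 0, otherwise take logs.
# Assumes strictly positive input values.
power_or_log <- function(x, lambda) {
  if (lambda > 0) x^lambda else log(x)
}

x <- c(1, 4, 16)
power_or_log(x, 0.5)   # lambda > 0: square roots of x
power_or_log(x, 0)     # lambda <= 0: natural logarithms of x
```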

normlen

Vector of dataset IDs that need to be normalized according to the length of the genomic regions.

minlen

Numeric value giving the minimum length of the genomic regions to be tested. Regions shorter than this threshold are discarded.
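The two length-related arguments, normlen and minlen, can be illustrated with a small sketch. It assumes that region length is end minus start and that length normalization means dividing the intensity by that length; neither detail is confirmed by this page, and the values are made up.

```r
# Toy regions and one made-up intensity value per region.
bed <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 500, 200),
  end   = c(130, 700, 1200),
  id    = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
signal <- c(a = 3, b = 20, c = 50)

len  <- bed$end - bed$start   # assumed definition of region length
keep <- len >= 50             # minlen = 50: shorter regions are discarded

bed_kept      <- bed[keep, ]
signal_per_bp <- signal[keep] / len[keep]   # assumed form of length normalization
```

Here region "a" (30 bp) is dropped, and the remaining intensities are expressed per base pair.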

normalize

Logical indicating whether the data will be normalized to the total library size. Default is FALSE.

qnormalize

Logical indicating whether quantile normalization will be applied to all samples. Default is TRUE.
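For readers unfamiliar with quantile normalization, the following base R sketch shows the general technique on a toy matrix. It is not the package's implementation; the helper name quantile_normalize and the tie-handling rule (ties.method = "first") are assumptions made for this example.

```r
# Generic quantile normalization: force every column (sample) to share the
# same distribution, namely the mean of the sorted columns.
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")
  sorted <- apply(m, 2, sort)
  target <- rowMeans(sorted)   # mean of the k-th smallest values across samples
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(m)
  out
}

m <- cbind(s1 = c(5, 2, 3, 4), s2 = c(4, 1, 4, 2), s3 = c(3, 4, 6, 8))
qn <- quantile_normalize(m)
print(qn)
```

After normalization every column contains the same set of values, only in a different order, so the samples become directly comparable.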

fun

Transformation function to be applied to PCs, which produces another PCx field in the output incorporating one or several PCs. Here are some examples:

  • function(x) sqrt(mean(x^2))

  • function(x) log2(mean(2^x))

  • function(x) x[1]

  • function(x) abs(x[1])

  • function(x) x[1]-x[2]

  • function(x) x[1]-x[2]-x[3]

  • function(x) x[1]+x[2]

  • function(x) sum(x)

  • function(x) sum(abs(x))

  • function(x) x[1]/x[2]

  • function(x) max(abs(x))

  • function(x) x[which.max(abs(x))]

By default, function(x) sqrt(mean(x^2)) is used.
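To make the role of fun concrete, the sketch below applies two of the listed functions to a made-up matrix of PC values, one row per region. The matrix, its names, and the idea that fun receives the PC values of one region at a time are assumptions for illustration; how dPCA chooses which PCs are passed to fun is not described here.

```r
# Made-up PC values: one row per region, one column per PC.
pcs <- rbind(region1 = c(3, -4), region2 = c(0.5, 0.5), region3 = c(-2, 0))
colnames(pcs) <- c("PC1", "PC2")

# The default: root mean square of the PCs of each region.
rms <- function(x) sqrt(mean(x^2))
pcx_rms <- apply(pcs, 1, rms)

# An alternative from the list: keep the PC with the largest magnitude, sign included.
pcx_max <- apply(pcs, 1, function(x) x[which.max(abs(x))])

print(pcx_rms)
print(pcx_max)
```

The root mean square discards the sign of the change, whereas the last variant keeps the direction of the dominant PC.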

verbose

Logical indicating whether additional information and figures are shown during power transformation and normalization.

processedData

Logical indicating whether the processed data will be returned along with the other outputs.

Details

This function filters, normalizes and transforms the selected groups and datasets of the data, then forwards the processed data to an incorporated C program called dPCA (see PMID: 23569280), and returns the dPCs and other information produced by that program.

Value

A named list containing three data.frames.

Author(s)

Qi Wang

Examples

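library(irene)  # package that provides dPCA and the CLL example data
data(CLL)
j <- c(1, 2, 6, 8)  # dataset IDs to analyze
res <- dPCA(meta, bed, data, groups = 1:2, datasets = j, transform = j,
            normlen = j, processedData = TRUE, verbose = TRUE)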

qwang-big/irene documentation built on May 23, 2019, 1:47 p.m.