PCADSC: Compute the elements used for PCADSC


View source: R/PCADSC.R

Description

Principal Component Analysis-based Data Structure Comparison tools that prepare a dataset for various diagnostic plots for comparing data structures. More specifically, PCADSC performs PCA on two subsets of a dataset in order to compare their structures, e.g. to assess whether the subsets can be pooled for analysis. The results of the PCAs are then manipulated in various ways and stored for easy plotting using the three PCADSC plotting tools: the CEPlot, the anglePlot and the chromaPlot.

Usage

PCADSC(data, splitBy, vars = NULL, doCE = TRUE, doAngle = TRUE,
  doChroma = TRUE, B = 10000)

Arguments

data

A dataset, either a data.frame or a matrix with variables in columns and observations in rows. Note that tibbles and data.tables are accepted as input, but they are immediately converted to data.frames. Future releases might include dedicated implementations for these data representations.

splitBy

The name of a grouping variable with two levels defining the two groups within the dataset whose data structures we wish to compare.

vars

The variable names in data to include in the PCADSC. If NULL (the default), all variables except for splitBy are used.

doCE

Logical. Should the cumulative eigenvalue plot information be computed?

doAngle

Logical. Should the angle plot information be computed?

doChroma

Logical. Should the chroma plot information be computed?

B

A positive integer. The number of resampling steps performed when computing the cumulative eigenvalue information (only relevant if doCE = TRUE).

Details

PCADSC presents a suite of non-parametric, visual tools for comparing the structures of two subsets of a dataset. These tools are all based on PCA (principal component analysis) and thus they can be interpreted as comparisons of the covariance matrices of the two (sub)datasets. PCADSC performs PCA using singular value decomposition for increased numerical precision. Before performing PCA on the full dataset and the two subsets, all variables within each such dataset are standardized.
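The standardize-then-SVD step described above can be illustrated with a minimal base R sketch. This is not the package's internal code: the helper name pcaBySVD and the variable isSetosa are hypothetical, and the sketch only assumes that each subset is centered and scaled on its own before the decomposition.

```r
# Illustrative sketch only; pcaBySVD is a hypothetical helper, not part of PCADSC.
data(iris)
X <- iris[, 1:4]                       # the four numeric variables
isSetosa <- iris$Species == "setosa"   # two-level split of the observations

pcaBySVD <- function(x) {
  # Standardize each variable within this (sub)dataset
  z <- scale(as.matrix(x))
  # Singular value decomposition of the standardized data
  s <- svd(z)
  list(
    # Eigenvalues of the correlation matrix: squared singular values / (n - 1)
    eigenvalues = s$d^2 / (nrow(z) - 1),
    # Columns are the principal component loadings
    loadings = s$v
  )
}

pca1 <- pcaBySVD(X[isSetosa, ])    # first subset
pca2 <- pcaBySVD(X[!isSetosa, ])   # second subset

# Cumulative share of variance per component, one column per subset
cbind(setosa    = cumsum(pca1$eigenvalues) / sum(pca1$eigenvalues),
      nonSetosa = cumsum(pca2$eigenvalues) / sum(pca2$eigenvalues))
```

Because the variables are standardized within each subset, the eigenvalues of each PCA sum to the number of variables, so the cumulative shares of the two subsets are directly comparable.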

Value

An object of class PCADSC, which is a named list with the following entries:

pcaRes

The results of the PCAs performed on the first subset, the second subset and the full dataset, as well as information about the data splitting.

CEInfo

The information needed for making a cumulative eigenvalue plot (see CEPlot).

angleInfo

The information needed for making an angle plot (see anglePlot).

chromaInfo

The information needed for making a chroma plot (see chromaPlot).

data

The original (full) dataset.

splitBy

The name of the variable that splits the dataset in two.

vars

The names of the variables in the dataset that should be used for PCA.

B

The number of resamplings performed for the CEInfo.

See Also

doCE, doAngle, doChroma, CEPlot, anglePlot, chromaPlot

Examples

#load iris data
data(iris)

#Define grouping variable, grouping the observations by whether their species is
#Setosa or not
iris$group <- "setosa"
iris$group[iris$Species != "setosa"] <- "non-setosa"
iris$Species <- NULL

## Not run: 
#Make a full PCADSC object, splitting the data by "group"
irisPCADSC <- PCADSC(iris, "group")

#The three plotting functions can now be called on irisPCADSC:
CEPlot(irisPCADSC)
anglePlot(irisPCADSC)
chromaPlot(irisPCADSC)

#Make a partial PCADSC object with no angle plot information and add
#angle plot information afterwards:
irisPCADSC2 <- PCADSC(iris, "group", doAngle = FALSE)
irisPCADSC2 <- doAngle(irisPCADSC2)

## End(Not run)

#Make a partial PCADSC object with no plotting (angle/CE/chroma)
#information:
irisPCADSC_minimal <- PCADSC(iris, "group", doAngle = FALSE,
  doCE = FALSE, doChroma = FALSE)

PCADSC documentation built on May 2, 2019, 1:09 p.m.
