Principal Component Analysis-based Data Structure Comparison tools that
prepare a dataset for various diagnostic plots for comparing data structures. More
specifically, PCADSC performs PCA on two subsets of a dataset in order to
compare the structures of these subsets, e.g. to assess whether they can be pooled for
analysis. The results of the PCAs are then processed in various
ways and stored for easy plotting using the three PCADSC plotting tools: the CEPlot,
the anglePlot and the chromaPlot.
data | A dataset, either a
splitBy | The name of a grouping variable with two levels defining the two groups within the dataset whose data structures we wish to compare.
vars | The variable names in
doCE | Logical. Should the cumulative eigenvalue plot information be computed?
doAngle | Logical. Should the angle plot information be computed?
doChroma | Logical. Should the chroma plot information be computed?
B | A positive integer. The number of resampling steps performed in the cumulative eigenvalue step, if relevant.
PCADSC presents a suite of non-parametric, visual tools for comparing the structures of
two subsets of a dataset. These tools are all based on PCA (principal component analysis) and
thus they can be interpreted as comparisons of the covariance matrices of the two (sub)datasets.
PCADSC performs PCA using singular value decomposition for increased numerical precision.
Before PCA is performed on the full dataset and on each of the two subsets, all variables
within each such dataset are standardized.
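The standardize-then-SVD step described above can be illustrated with a small base R sketch. This is not PCADSC's internal code: the helper name `svdPCA` is hypothetical, and the sketch assumes that "standardized" means centering each variable and scaling it to unit variance, as `scale()` does by default.

```r
# Base R sketch of PCA via SVD on standardized data.
# Assumption: this mirrors the description above, not PCADSC's actual internals.
data(iris)
X <- as.matrix(iris[, 1:4])
isSetosa <- iris$Species == "setosa"

# Hypothetical helper: standardize the columns, then decompose with svd()
svdPCA <- function(M) {
  Z <- scale(M)  # center each variable and scale it to unit variance
  s <- svd(Z)
  list(eigenvalues = s$d^2 / (nrow(Z) - 1),  # variance of each component
       loadings = s$v)                       # one column per component
}

# PCA on the first subset, the second subset and the full dataset
pcaFirst  <- svdPCA(X[isSetosa, ])
pcaSecond <- svdPCA(X[!isSetosa, ])
pcaFull   <- svdPCA(X)

# On standardized data, the eigenvalues match those of the correlation matrix
corEigen <- eigen(cor(X[isSetosa, ]))$values
all.equal(pcaFirst$eigenvalues, corEigen)

# and they sum to the number of variables
sum(pcaFirst$eigenvalues)
```

Because each (sub)dataset is standardized separately, the comparison in this sketch is effectively between correlation structures rather than raw variances.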
An object of class PCADSC, which is a named list with the following entries:
The results of the PCAs performed on the first subset, the second subset and the full dataset, and also information about the data splitting.
The information needed for making a cumulative eigenvalue plot (see CEPlot).
The information needed for making an angle plot (see anglePlot).
The information needed for making a chroma plot (see chromaPlot).
The original (full) dataset.
The name of the variable that splits the dataset in two.
The names of the variables in the dataset that should be used for PCA.
The number of resamplings performed for the CEInfo.
doCE, doAngle, doChroma, CEPlot, anglePlot, chromaPlot
#load iris data
data(iris)
#Define grouping variable, grouping the observations by whether their species is
#Setosa or not
iris$group <- "setosa"
iris$group[iris$Species != "setosa"] <- "non-setosa"
iris$Species <- NULL
## Not run:
#Make a full PCADSC object, splitting the data by "group"
irisPCADSC <- PCADSC(iris, "group")
#The three plotting functions can now be called on irisPCADSC:
CEPlot(irisPCADSC)
anglePlot(irisPCADSC)
chromaPlot(irisPCADSC)
#Make a partial PCADSC object with no angle plot information and add
#angle plot information afterwards:
irisPCADSC2 <- PCADSC(iris, "group", doAngle = FALSE)
irisPCADSC2 <- doAngle(irisPCADSC2)
## End(Not run)
#Make a partial PCADSC object with no plotting (angle/CE/chroma)
#information:
irisPCADSC_minimal <- PCADSC(iris, "group", doAngle = FALSE,
doCE = FALSE, doChroma = FALSE)