RSimca performs a robust version of the SIMCA method. This method classifies a data matrix x with a known group structure. To reduce the dimension within each group, a robust PCA is performed. Afterwards a classification rule is developed to determine the assignment of new observations.
formula | a formula of the form
data | an optional data frame (or similar: see
subset | an optional vector used to select rows (observations) of the data matrix
na.action | a function which indicates what should happen when the data contain
x | a matrix or data frame containing the explanatory variables (training set).
grouping | grouping variable: a factor specifying the class for each observation.
prior | prior probabilities, default to the class proportions for the training set.
tol | tolerance
control | a control object (S4) for specifying one of the available PCA estimation methods and containing estimation options. The class of this object defines which estimator will be used. Alternatively a character string can be specified which names the estimator - one of auto, hubert, locantore, grid, proj. If 'auto' is specified or the argument is missing, the function will select the estimator (see below for details)
alpha | this parameter measures the fraction of outliers the algorithm should resist. In MCD alpha controls the size of the subsets over which the determinant is minimized, i.e. alpha*n observations are used for computing the determinant. Allowed values are between 0.5 and 1 and the default is 0.5.
k | number of principal components to compute. If
kmax | maximal number of principal components to compute. Default is
trace | whether to print intermediate results. Default is
... | arguments passed to or from other methods.
RSimca, serving as a constructor for objects of class RSimca-class,
is a generic function with "formula" and "default" methods.
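As a hedged illustration of the "default" method, the sketch below passes the explanatory variables and the grouping factor directly through the documented arguments x and grouping. It assumes that rrcovHD is installed and that the pottery data set used in the examples on this page is available once the package is loaded; the object name rs_default is arbitrary.

```r
## Sketch only: calls the "default" method with the documented
## arguments x and grouping (assumes rrcovHD is installed).
library(rrcovHD)
data(pottery)

x   <- pottery[, 1:6]     # the six explanatory variables
grp <- pottery$origin     # factor with the class of each observation

rs_default <- RSimca(x, grouping = grp)
rs_default
```

The formula interface, RSimca(origin ~ ., data = pottery), is the form used in the examples on this page; the call above is only meant to show how the same data map onto x and grouping.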
SIMCA is a two phase procedure consisting of PCA performed on each group
separately for dimension reduction followed by classification rules built
in the lower dimensional space (note that the dimension in
each group can be different). Instead of classical PCA robust alternatives will be used.
Any of the robust PCA methods available (see Pca-class)
can be used through the argument control.
In the original SIMCA, new observations are
classified by means of their deviations from the different PCA models.
Here the classification rules are obtained using two popular distances arising from PCA:
orthogonal distances (OD) and score distances (SD). For the definition of these distances,
the definition of the cutoff values and the standardization of the distances see
Vanden Branden K, Hubert M (2005) and Todorov and Filzmoser (2009).
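To make the two distances concrete, here is a minimal base R sketch. It is not the RSimca implementation: classical PCA (prcomp) stands in for the robust PCA, the built-in iris data stand in for a real training set, and the helper sd_od() and the choice k = 2 are hypothetical. For an observation with centred scores t on k components with eigenvalues l, the score distance is sqrt(sum(t^2 / l)) and the orthogonal distance is the Euclidean norm of the residual after projecting onto the PCA subspace.

```r
## Sketch only: classical PCA replaces the robust PCA used by RSimca;
## sd_od() is a hypothetical helper, not part of any package.
sd_od <- function(train, newdata, k) {
  pc     <- prcomp(train, center = TRUE, scale. = FALSE)
  P      <- pc$rotation[, seq_len(k), drop = FALSE]  # loadings (p x k)
  lambda <- pc$sdev[seq_len(k)]^2                    # eigenvalues
  xc     <- sweep(as.matrix(newdata), 2, pc$center)  # centre with the group centre
  scores <- xc %*% P                                 # scores in the PCA subspace
  sd <- sqrt(rowSums(sweep(scores^2, 2, lambda, "/")))  # score distance
  od <- sqrt(rowSums((xc - scores %*% t(P))^2))         # orthogonal distance
  cbind(SD = sd, OD = od)
}

## One PCA model per group, as in the first phase of SIMCA
x      <- as.matrix(iris[, 1:4])
groups <- split(as.data.frame(x), iris$Species)

## Distances of three observations to each group's PCA model
dist_by_group <- lapply(groups, sd_od, newdata = x[c(1, 51, 101), ], k = 2)
dist_by_group
```

The sketch stops at the raw distances. The cutoff values, the standardization and the way the two distances are combined into a classification rule are described in the references cited above.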
An S4 object of class RSimca-class,
which is a subclass of the
virtual class Simca-class.
Valentin Todorov valentin.todorov@chello.at
Vanden Branden K, Hubert M (2005) Robust classification in high dimensions based on the SIMCA method. Chemometrics and Intelligent Laboratory Systems 79:10–21
Todorov V & Filzmoser P (2014), Software Tools for Robust Analysis of High-Dimensional Data. Austrian Journal of Statistics, 43(4), 255–266, doi: 10.17713/ajs.v43i4.44.
data(pottery)
dim(pottery) # 27 observations in 2 classes, 6 variables
head(pottery)
## Build the SIMCA model. Use RSimca for a robust version
rs <- RSimca(origin~., data=pottery)
rs
summary(rs)
## generate a sample from the pottery data set -
## this will be the "new" data to be predicted
smpl <- sample(1:nrow(pottery), 5)
test <- pottery[smpl, -7] # extract the test sample. Remove the last (grouping) variable
print(test)
## predict new data
pr <- predict(rs, newdata=test)
pr@classification
Loading required package: rrcov
Loading required package: robustbase
Scalable Robust Estimators with High Breakdown Point (version 1.4-3)
Robust Multivariate Methods for High Dimensional Data (version 0.2-5)
[1] 27 7
SI AL FE MG CA TI origin
1 55.8 14.0 10.2 4.9 5.0 0.88 Attic
2 51.2 12.5 10.1 4.4 4.8 0.86 Attic
3 57.1 14.0 8.3 6.4 11.2 0.75 Attic
4 53.8 13.1 9.3 4.9 6.6 0.81 Attic
5 59.4 14.8 9.8 5.5 5.4 0.89 Attic
6 56.2 14.0 9.9 4.9 5.4 0.89 Attic
Call:
RSimca(origin ~ ., data = pottery)
Prior Probabilities of Groups:
Attic Eritrean
0.4814815 0.5185185
Pca objects for Groups:
Call:
PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
Importance of components:
PC1 PC2
Standard deviation 6.1708 1.00333
Proportion of Variance 0.9742 0.02576
Cumulative Proportion 0.9742 1.00000
Call:
PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
Importance of components:
PC1
Standard deviation 3.645
Proportion of Variance 1.000
Cumulative Proportion 1.000
Call:
RSimca(formula = origin ~ ., data = pottery)
Prior Probabilities of Groups:
Attic Eritrean
0.4814815 0.5185185
Pca objects for Groups:
Call:
PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
Importance of components:
PC1 PC2
Standard deviation 6.1708 1.00333
Proportion of Variance 0.9742 0.02576
Cumulative Proportion 0.9742 1.00000
Call:
PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
Importance of components:
PC1
Standard deviation 3.645
Proportion of Variance 1.000
Cumulative Proportion 1.000
SI AL FE MG CA TI
19 53.9 16.5 9.5 2.5 4.7 0.85
22 48.1 15.8 9.8 2.0 5.6 0.83
26 57.2 17.3 9.3 2.7 5.6 0.83
12 69.9 13.1 9.8 4.4 4.4 0.89
21 54.5 17.3 9.8 2.6 4.9 0.85
[1] Eritrean Eritrean Eritrean Eritrean Eritrean
Levels: Attic Eritrean
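The predicted classes can be compared with the known origins of the sampled rows. The following is a sketch under assumptions: rrcovHD is installed, the seed value is arbitrary, and the cross-tabulation with table() is an addition for illustration, not part of the original example.

```r
## Sketch only: cross-tabulate predicted against known classes
## (assumes rrcovHD is installed; the seed is arbitrary).
library(rrcovHD)
data(pottery)

set.seed(1234)                              # make the sample reproducible
rs   <- RSimca(origin ~ ., data = pottery)
smpl <- sample(seq_len(nrow(pottery)), 5)
pr   <- predict(rs, newdata = pottery[smpl, -7])

## Rows: known origin, columns: class assigned by the model
table(actual = pottery$origin[smpl], predicted = pr@classification)
```

Because the sampled rows were also used to fit the model, any agreement seen here is optimistic; an independent test set would give a fairer picture.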