When PCA is calculated with a function such as irlba(), you can choose (for speed) to calculate only a subset of the eigenvalues. In that case there is no exact percentage of variance explained by the PCA as a whole, or by each component, of the kind that other routines return. This function fits a linear or b*1/x model to estimate the area under the curve (AUC) for the unknown eigenvalues. That gives a reasonable estimate of the variance accounted for by each unknown eigenvalue, and of the predicted sum of the unknown eigenvalues.
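The idea of extrapolating the eigenvalue tail can be sketched in base R. This is a rough illustration under stated assumptions, not the package's implementation: every variable name below is hypothetical, the full eigenvalue set is computed only so the estimate can be compared with the exact answer, and clamping negative predictions to zero is a choice made here for the sketch.

```r
# Sketch only: NOT the bigpca implementation; all names are illustrative.
set.seed(1)
mat <- matrix(rnorm(300 * 100), ncol = 100)  # 300 x 100: at most 100 eigenvalues
all.eig <- svd(mat, nu = 0, nv = 0)$d^2      # full set, kept only for comparison
n.known <- 25; elbow <- 6; min.dim <- min(dim(mat))
known <- all.eig[seq_len(n.known)]           # pretend only these were calculated

# Fit a straight line to the 'noise' eigenvalues beyond the elbow
idx <- (elbow + 1):n.known
fit <- lm(y ~ x, data = data.frame(x = idx, y = known[idx]))

# Extrapolate to the uncalculated positions; eigenvalues cannot be negative,
# so negative predictions are clamped to zero (an assumption of this sketch)
unknown.idx <- (n.known + 1):min.dim
pred <- pmax(predict(fit, newdata = data.frame(x = unknown.idx)), 0)

tail.sum <- sum(pred)                         # estimated sum of unknown eigenvalues
est.varpc <- known / (sum(known) + tail.sum)  # estimated variance proportions
true.varpc <- known / sum(all.eig)            # exact proportions, for comparison
print(round(cbind(estimated = est.varpc, exact = true.varpc)[1:elbow, ], 4))
```

The estimated proportions depend entirely on how well the fitted line describes the unseen tail, so they should be read as approximations.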
eigenv
the vector of eigenvalues actually calculated
min.dim
the size of the smaller dimension of the matrix submitted to singular value decomposition, e.g. the number of samples; i.e. the maximum possible number of eigenvalues. Alternatively use 'M'.
M
optional; the original dataset 'M', used only to derive the dimensions. Alternatively use 'min.dim'.
elbow
the number of components that you think explain the important portion of the variance of the dataset; further components are assumed to reflect noise or very subtle effects. The number of components used is often decided by the 'elbow' in a scree plot (see 'pca.scree.plot').
linear
whether to use a linear model for the 'noise' eigenvalues; the alternative is a 1/x model with no intercept.
estimated
logical, whether to return the estimated variance percentages for unobserved eigenvalues along with the real data; also generates a factor describing which values in the returned vector are observed and which are estimated.
print.est
whether to print the estimate result to the console
print.coef
whether to print the estimated regression coefficients to the console
add.fit.line
logical; if there is an existing scree plot, adds the fit line from this estimate to the plot ('pca.scree.plot' can use this option via the parameter of the same name)
col
colour for the fit line
ignore.warn
whether to ignore warnings when an estimate is not required (i.e. all eigenvalues are present)
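The two model forms named under 'linear' can be written as ordinary lm() formulas. The following is a minimal sketch under stated assumptions, not the package's internals: the variable names are hypothetical, the 'noise' region is taken to be everything beyond the elbow, and the exact tail is computed only for comparison.

```r
# Sketch only: illustrative names, not the package's internals.
set.seed(2)
mat <- matrix(rnorm(300 * 100), ncol = 100)
eig <- svd(mat, nu = 0, nv = 0)$d^2            # all eigenvalues, for comparison
n.known <- 25; elbow <- 6; min.dim <- min(dim(mat))

# 'Noise' eigenvalues: the calculated ones beyond the elbow
noise <- data.frame(x = (elbow + 1):n.known)
noise$y <- eig[noise$x]
new.x <- data.frame(x = (n.known + 1):min.dim) # uncalculated positions

fit.lin <- lm(y ~ x, data = noise)             # linear model: a + b*x
fit.inv <- lm(y ~ 0 + I(1 / x), data = noise)  # 1/x model, no intercept: b*(1/x)

# Estimated sum of the uncalculated eigenvalues under each model
tail.lin <- sum(pmax(predict(fit.lin, new.x), 0))  # clamp at zero (sketch choice)
tail.inv <- sum(predict(fit.inv, new.x))
true.tail <- sum(eig[new.x$x])
print(c(linear = tail.lin, inverse = tail.inv, exact = true.tail))
```

Which form tracks the true tail more closely depends on the data, so it is worth comparing both against a full decomposition on a small subsample where that is affordable.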
By default returns a list in which the first element, 'variance.pcs', holds the variance percentages for each known eigenvalue based on the estimated divisor, and the second element, 'tail.auc', is the area under the curve for the estimated eigenvalues. If estimated=TRUE, a third element is returned with separate variance percentages for each of the estimated eigenvalues.
nsamp <- 100; nvar <- 300; subset.size <- 25; elbow <- 6
mat <- matrix(rnorm(nsamp*nvar),ncol=nsamp)
# or use: # mat <- crimtab-rowMeans(crimtab) ; subset.size <- 10 # crimtab centred
prv.large(mat)
pca <- svd(mat,nv=subset.size,nu=0) # calculates subset of V, but all D
require(irlba)
pca2 <- irlba(mat,nv=subset.size,nu=0) # calculates subset of V & D
pca3 <- princomp(mat,cor=TRUE) # calculates all
# number of eigenvalues for svd is the smaller dimension of the matrix
eig.varpc <- estimate.eig.vpcs(pca$d^2,M=mat)$variance.pcs
cat("sum of all eigenvalue-variances=",sum(eig.varpc),"\n")
print(eig.varpc[1:elbow])
# number of eigenvalues for irlba is the size of the subset if < min(dim(M))
eig.varpc <- estimate.eig.vpcs((pca2$d^2)[1:subset.size],M=mat)$variance.pcs
print(eig.varpc[1:elbow]) ## using 1/x model, underestimates total variance
eig.varpc <- estimate.eig.vpcs((pca2$d^2)[1:subset.size],M=mat,linear=TRUE)$variance.pcs
print(eig.varpc[1:elbow]) ## using linear model, closer to exact answer
eig.varpc <- estimate.eig.vpcs((pca3$sdev^2),M=mat)$variance.pcs
print(eig.varpc[1:elbow]) ## different analysis, but fairly similar var.pcs
Loading required package: reader
Loading required package: NCmisc
Attaching package: 'reader'
The following objects are masked from 'package:NCmisc':
cat.path, get.ext, rmv.ext
Loading required package: bigmemory
Loading required package: biganalytics
Loading required package: foreach
Loading required package: biglm
Loading required package: DBI
Warning messages:
1: replacing previous import 'reader::cat.path' by 'NCmisc::cat.path' when loading 'bigpca'
2: replacing previous import 'reader::get.ext' by 'NCmisc::get.ext' when loading 'bigpca'
3: replacing previous import 'reader::rmv.ext' by 'NCmisc::rmv.ext' when loading 'bigpca'
       col#
row#        1       2  .....     100
   1   -0.091 -0.3784    ...  1.3543
   2  -0.5153  -0.148    ...  1.8725
   3   -0.503  0.7059    ...  0.4792
 ...      ...     ...    ...     ...
 300  -2.0728 -1.3992    ... -0.9845
Loading required package: irlba
Loading required package: Matrix
All eigenvalues present, estimate not required
sum of all eigenvalue-variances= 1
[1] 0.02390282 0.02373479 0.02255394 0.02214702 0.02172337 0.02091343
estimate of eigenvalue sum of 75 uncalculated eigenvalues: 7356.386
[1] 0.03391549 0.03367708 0.03200158 0.03142422 0.03082309 0.02967388
estimate of eigenvalue sum of 75 uncalculated eigenvalues: 7356.386
[1] 0.03391549 0.03367708 0.03200158 0.03142422 0.03082309 0.02967388
All eigenvalues present, estimate not required
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
0.02413550 0.02365197 0.02295332 0.02195104 0.02164209 0.02071138