pc_stability: PC stability by nonparametric bootstrapping

View source: R/user_functions.R

pc_stability    R Documentation

PC stability by nonparametric bootstrapping

Description

Extracts nonparametric estimates of the standardized loadings and their confidence regions by means of bootstrapping. This function uses the boot function from the boot package.
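As a rough illustration of the idea (not the package's internal code), the sketch below bootstraps the standardized loadings of the first PC of mtcars by calling boot::boot directly. The helper name pc1_loadings, the sign-alignment step, the percentile summary, and R = 200 are all assumptions made for this example.

```r
library(boot)

set.seed(1)

# Reference fit on the full data; standardized loadings = eigenvector * sdev
ref_fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)
ref_pc1 <- ref_fit$rotation[, 1] * ref_fit$sdev[1]

# Statistic for boot(): refit the PCA on the resampled rows
# (hypothetical helper, named here only for illustration)
pc1_loadings <- function(data, idx) {
  fit <- prcomp(data[idx, ], center = TRUE, scale. = TRUE)
  l <- fit$rotation[, 1] * fit$sdev[1]
  # The sign of a PC is arbitrary, so align each resample with the reference
  if (sum(l * ref_pc1) < 0) l <- -l
  l
}

boot_out <- boot(mtcars, statistic = pc1_loadings, R = 200, sim = "ordinary")

# Simple percentile summary of the resampled loadings, one column per variable
loading_ci <- apply(boot_out$t, 2, quantile, probs = c(0.025, 0.975))
colnames(loading_ci) <- names(ref_pc1)
```

Each row of boot_out$t holds the loadings from one resample, so the spread of a column gives a sense of how stable that variable's loading is.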

Usage

pc_stability(
  pca,
  pca_data,
  ndim = 3,
  B = 1000,
  sim = "ordinary",
  communalities = TRUE,
  similarity_metric = "all",
  s_cut_off = 0.1,
  ci_type = "bca",
  conf = 0.95,
  inParallel = F,
  n_cores = 2,
  inParallel_extra = NULL,
  ...
)

Arguments

pca

Object of class prcomp or princals.

pca_data

Data passed to the prcomp or princals function.

ndim

Numeric. Number of PCs (1 to ndim) to run the analysis on. Default = 3.

B

Numeric. Number of bootstrapped samples passed to boot.

sim

Character. Determines the bootstrapping method for boot. From boot: A character string indicating the type of simulation required. Possible values are "ordinary" (the default), "parametric", "balanced", "permutation", or "antithetic". Default="ordinary".

communalities

Logical. Whether to compute and return communalities.

similarity_metric

Character or character vector. Specifies the similarity metric(s) to use. Set to "none" to not return similarity metrics. Possible values are "cc_index" (congruence coefficient), "r_correlation" (Pearson's r), "rmse" (root mean squared error), "s_index" (Cattell's s metric), or "all". See ?component_similarity for more details on the available metrics. Default="all".

s_cut_off

Numeric. The loading cut-off used to determine whether a variable is salient or not in Cattell's terms. See ?extract_s for more information. Default=0.1.

ci_type

Character. Type of confidence interval to compute. This argument is passed to the boot.ci function from the boot package. See ?boot.ci for options. Given that the BCa method has demonstrated good performance for bootstrapping PCAs, "bca" is set as the default. See References for more details.

conf

Numeric. Confidence level for the confidence interval, e.g. 0.95 generates a 95% CI. Default=0.95.

inParallel

Logical. Whether to run the function in parallel using pbapply::pblapply. On Windows machines, parallelization is done through parallel::parLapply. On Unix machines, parallelization is done through parallel::mclapply. See ?pblapply for details. Default = F.

n_cores

Numeric. Number of cores to use. The number of available cores can be obtained with parallel::detectCores(). It is recommended to use fewer cores than are available. Default = 2.

inParallel_extra

Character or character vector with the names of additional objects to pass to the clusters when performing bootstrapping. This might be needed when the function used to run the original PCA was called with external objects, e.g. princals(..., ndim = ncol(dataFrame)), where dataFrame is a data frame object. In that case, set inParallel_extra = "dataFrame".

...

Other arguments passed to the boot function.
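To give a feel for what the similarity_metric and s_cut_off arguments refer to, the sketch below computes the four metrics for two toy loading vectors using their usual textbook definitions. The function names congruence, rmse and cattell_s are hypothetical and are not the package's own functions; component_similarity and extract_s may differ in detail.

```r
# Tucker's congruence coefficient between two loading vectors
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Root mean squared error between two loading vectors
rmse <- function(x, y) sqrt(mean((x - y)^2))

# Cattell's salient similarity index s (usual definition, assumed here):
# loadings are classified as positive salient, hyperplane, or negative
# salient using the cut-off, and the classifications are cross-tabulated.
cattell_s <- function(x, y, cut_off = 0.1) {
  classify <- function(l) {
    factor(ifelse(l >= cut_off, "pos", ifelse(l <= -cut_off, "neg", "hyp")),
           levels = c("pos", "hyp", "neg"))
  }
  tab <- table(classify(x), classify(y))
  agree <- tab["pos", "pos"] + tab["neg", "neg"]
  oppose <- tab["pos", "neg"] + tab["neg", "pos"]
  partial <- tab["pos", "hyp"] + tab["hyp", "pos"] +
    tab["neg", "hyp"] + tab["hyp", "neg"]
  # Returns NaN when neither vector has any salient loading
  as.numeric((agree - oppose) / (agree + oppose + 0.5 * partial))
}

# Toy loadings of the same component from two analyses
x <- c(0.8, 0.5, 0.05, -0.6)
y <- c(0.7, 0.6, -0.02, -0.5)

metrics <- c(
  cc_index      = congruence(x, y),
  r_correlation = cor(x, y),
  rmse          = rmse(x, y),
  s_index       = cattell_s(x, y, cut_off = 0.1)
)
```

With these definitions, identical vectors give a congruence and s index of 1 and an RMSE of 0, while flipping the sign of one vector gives a congruence and s index of -1.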

Details

The number of bootstrap samples is set to 1000 by default, as this has been shown to be a robust number under most conditions of data complexity and sample size. Be careful not to set this number too low, as that reduces the quality of the approximation. Conversely, values that are too high might unnecessarily increase computing time with little gain (REFs).
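To make the roles of B, ci_type and conf concrete, here is a minimal sketch that calls boot and boot.ci directly on a single loading. It is an illustration under stated assumptions, not how pc_stability is implemented: the statistic mpg_loading is hypothetical, and B is kept at 500 only so the example runs quickly.

```r
library(boot)

set.seed(42)

# Reference fit, used only to fix the arbitrary sign of PC1
ref_fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)
ref_pc1 <- ref_fit$rotation[, 1] * ref_fit$sdev[1]

# Hypothetical statistic: standardized loading of mpg on PC1
mpg_loading <- function(data, idx) {
  fit <- prcomp(data[idx, ], center = TRUE, scale. = TRUE)
  l <- fit$rotation[, 1] * fit$sdev[1]
  if (sum(l * ref_pc1) < 0) l <- -l
  unname(l["mpg"])
}

B <- 500  # number of bootstrap resamples (should exceed the number of rows for BCa)
boot_out <- boot(mtcars, statistic = mpg_loading, R = B)

# BCa interval at the 95% level
ci <- boot.ci(boot_out, conf = 0.95, type = "bca")

# ci$bca holds: conf level, two order-statistic positions, lower and upper limit
bca_limits <- ci$bca[1, 4:5]
```

Rerunning the sketch with a much smaller B and different seeds is a simple way to see how the interval limits fluctuate when too few resamples are drawn.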

Value

Returns a list object of class "syndromics".

method

Specifies the method used to obtain the syndromics list object.

pca

Contains the object passed to the pca argument

pca_data

Contains the object passed to the pca_data argument

ndim

Value specified in the ndim argument

ci_method

Method used to compute CIs

conf

Confidence level used to compute CIs

results

Object containing the results of the analysis

boot_sample

A list of length B containing the resampled loadings.

pc_similarity

A list of results when similarity_metric is not "none".

similarity_mean

A numeric matrix with the mean of the chosen similarity metric.

similarity_ci_low and similarity_ci_high

Numeric matrices with the lower and upper CI limits, respectively.

B

Number of resamples computed.

communalities

If communalities = TRUE, the results are returned here.
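For orientation, the communality of a variable is commonly defined as the sum of its squared standardized loadings over the retained PCs. The sketch below computes it from a prcomp fit under that assumption; the object names are illustrative, and the package may compute or bootstrap communalities differently.

```r
fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Standardized loadings: each eigenvector scaled by the sdev of its PC
std_loadings <- sweep(fit$rotation, 2, fit$sdev, `*`)

# Communality over the first ndim PCs: row sums of squared loadings
ndim <- 3
communality <- rowSums(std_loadings[, seq_len(ndim)]^2)

# Sanity check: with all PCs retained on scaled data,
# every variable's communality is 1
full_communality <- rowSums(std_loadings^2)
```

A communality close to 1 means the first ndim PCs capture most of that variable's variance, while a low value means the variable is poorly represented by them.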

Author(s)

Abel Torres Espin

References

Efron B. Better Bootstrap Confidence Intervals. J Am Stat Assoc. 1987 Mar 1;82(397):171–85.

Examples

data(mtcars)
pca_mtcars <- prcomp(mtcars, center = TRUE, scale. = TRUE)

pca_mtcars_stab <- pc_stability(pca = pca_mtcars, pca_data = mtcars, ndim = 3, B = 500)
plot(pca_mtcars_stab, plot_resample = TRUE)


ucsf-ferguson-lab/syndRomics documentation built on June 26, 2022, 5:36 p.m.