bpFitGrid: Identify the Optimal Contrastive and Penalty Parameters in...

View source: R/fitGrid.R

bpFitGrid    R Documentation

Identify the Optimal Contrastive and Penalty Parameters in Parallel

Description

This function automatically selects the optimal contrastive parameter and L1 penalty term for scPCA based on a clustering algorithm and average silhouette width. It is analogous to fitGrid, but replaces every lapply call with BiocParallel's bplapply so that the grid search can run in parallel.
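As a rough illustration of the lapply-to-bplapply substitution described above, the sketch below applies the same function over a grid of parameters both ways. The function fit_one and the contrasts values are hypothetical stand-ins, not part of scPCA, and the parallel branch only runs if the BiocParallel package is installed.

```r
# Hypothetical stand-in for the work done per contrastive parameter
fit_one <- function(gamma) gamma^2

# Hypothetical grid of contrastive parameters
contrasts <- c(0.1, 1, 10)

# Serial evaluation, as in a plain lapply-based grid search
serial_fits <- lapply(contrasts, fit_one)

# Drop-in replacement with bplapply; SerialParam() keeps this example
# deterministic, and a parallel back-end can be supplied via BPPARAM instead
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  parallel_fits <- BiocParallel::bplapply(
    contrasts, fit_one,
    BPPARAM = BiocParallel::SerialParam()
  )
}
```

Both calls return a list with one element per entry of contrasts, which is why one can be swapped for the other without changing downstream code.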

Usage

bpFitGrid(
  target,
  target_valid = NULL,
  center,
  scale,
  c_contrasts,
  contrasts,
  penalties,
  n_eigen,
  alg,
  clust_method = c("kmeans", "pam", "hclust"),
  n_centers,
  max_iter = 10,
  linkage_method = "complete",
  clusters = NULL,
  eigdecomp_tol = 1e-10,
  eigdecomp_iter = 1000
)

Arguments

target

The target (experimental) data set, in a standard format such as a data.frame or matrix.

target_valid

A holdout set of the target (experimental) data set, in a standard format such as a data.frame or matrix. NULL by default but used by cvSelectParams for cross-validated selection of the contrastive and penalization parameters.

center

A logical indicating whether the target and background data sets should be centered to mean zero.

scale

A logical indicating whether the target and background data sets should be scaled to unit variance.

c_contrasts

A list of contrastive covariances.

contrasts

A numeric vector of the contrastive parameters used to compute the contrastive covariances.
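To make the relationship between contrasts and c_contrasts concrete, here is a minimal sketch that builds one contrastive covariance matrix per contrastive parameter. It assumes the usual cPCA definition (target covariance minus the contrastive parameter times the background covariance); the toy target and background matrices are hypothetical.

```r
set.seed(1)

# Hypothetical toy data: 50 observations of 4 variables each
target <- matrix(rnorm(200), nrow = 50)
background <- matrix(rnorm(200), nrow = 50)

# Hypothetical grid of contrastive parameters
contrasts <- c(0, 0.5, 1, 2)

c_target <- cov(target)
c_background <- cov(background)

# Assumed definition: C_target - gamma * C_background, one matrix per gamma
c_contrasts <- lapply(contrasts, function(gamma) {
  c_target - gamma * c_background
})
```

Under this definition, the list has the same length and order as contrasts, and a contrastive parameter of 0 reduces to the ordinary target covariance.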

penalties

A numeric vector of the penalty terms.

n_eigen

A numeric indicating the number of eigenvectors to be computed.

alg

A character indicating the SPCA algorithm used to sparsify the contrastive loadings. Currently supports iterative for the Zou et al. (2006) implementation, var_proj for the non-randomized Erichson et al. (2018) solution, and rand_var_proj for the randomized Erichson et al. (2018) result.

clust_method

A character specifying the clustering method to use for choosing the optimal contrastive parameter. Currently, this is limited to k-means, partitioning around medoids (PAM), or hierarchical clustering. The default is k-means clustering.

n_centers

A numeric giving the number of centers to use in the clustering algorithm.

max_iter

A numeric giving the maximum number of iterations to be used in k-means clustering, defaulting to 10.
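The following sketch shows how a clustering can be scored by average silhouette width, the selection criterion named in the Description. It is an illustration on hypothetical two-group data using stats::kmeans and cluster::silhouette, not the package's internal code; the variable names n_centers and max_iter simply mirror the arguments above.

```r
library(cluster)  # provides silhouette()

set.seed(42)

# Hypothetical low-dimensional representation with two well-separated groups
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
)

n_centers <- 2
max_iter <- 10

# k-means with a capped number of iterations
km <- kmeans(x, centers = n_centers, iter.max = max_iter)

# Silhouette widths per observation, then their average
sil <- silhouette(km$cluster, dist(x))
avg_sil_width <- mean(sil[, "sil_width"])
```

Average silhouette width lies between -1 and 1, and larger values indicate tighter, better-separated clusters, so a grid search would favor the hyperparameters whose representation yields the largest value.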

linkage_method

A character specifying the agglomerative linkage method to be used if clust_method = "hclust". The options are ward.D2, single, complete, average, mcquitty, median, and centroid. The default is complete.
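For the hierarchical option, a minimal sketch of how a linkage method and a number of clusters translate into cluster labels with base R is given below. The data are hypothetical, and the use of stats::hclust followed by cutree is an assumption about a typical workflow rather than a description of scPCA internals.

```r
set.seed(7)

# Hypothetical data: two groups of 30 observations in two dimensions
x <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 6), ncol = 2)
)

linkage_method <- "complete"
n_centers <- 2

# Agglomerative clustering on Euclidean distances
tree <- hclust(dist(x), method = linkage_method)

# Cut the dendrogram into the requested number of clusters
hc_labels <- cutree(tree, k = n_centers)
```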

clusters

A numeric vector of cluster labels for observations in the target data. Defaults to NULL, but is otherwise used to identify the optimal set of hyperparameters when fitting scPCA and the automated version of cPCA.

eigdecomp_tol

A numeric providing the level of precision used by eigendecomposition calculations. Defaults to 1e-10.

eigdecomp_iter

A numeric indicating the maximum number of iterations performed by eigendecomposition calculations. Defaults to 1000.
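To show what a tolerance and an iteration cap mean for an iterative eigen-solver, the sketch below implements plain power iteration for the leading eigenpair of a covariance matrix. This is a swapped-in teaching example, not the solver scPCA uses; the function power_iteration and the toy data are hypothetical.

```r
# Power iteration for the leading eigenpair of a symmetric
# positive semi-definite matrix (illustration only)
power_iteration <- function(S, tol = 1e-10, max_iter = 1000) {
  v <- rep(1 / sqrt(ncol(S)), ncol(S))  # deterministic unit-norm start
  lambda <- 0
  converged <- FALSE
  for (i in seq_len(max_iter)) {
    w <- drop(S %*% v)
    v <- w / sqrt(sum(w^2))
    lambda_new <- drop(crossprod(v, S %*% v))  # Rayleigh quotient
    converged <- abs(lambda_new - lambda) < tol
    lambda <- lambda_new
    if (converged) break  # stop once the estimate changes by less than tol
  }
  list(value = lambda, vector = v, iterations = i, converged = converged)
}

set.seed(3)

# Hypothetical data with one dominant direction of variance
X <- matrix(rnorm(1000), ncol = 5) %*% diag(c(5, 2, 1, 1, 1))
S <- cov(X)

fit <- power_iteration(S, tol = 1e-10, max_iter = 1000)
```

A tighter tolerance gives a more precise estimate at the cost of more iterations, while the iteration cap bounds the run time when convergence is slow.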

Value

A list similar to that output by prcomp:

  • rotation - the matrix of variable loadings

  • x - the rotated data: the centred (and scaled, if requested) data multiplied by the rotation matrix

  • contrast - the optimal contrastive parameter

  • penalty - the optimal L1 penalty term
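Since the return value is modeled on prcomp, a short sketch of how rotation and x relate in a prcomp fit may help when reading the list above. The data are hypothetical, and the example uses stats::prcomp itself rather than bpFitGrid.

```r
set.seed(10)

# Hypothetical data: 30 observations of 4 variables
X <- matrix(rnorm(120), ncol = 4)

fit <- prcomp(X, center = TRUE, scale. = TRUE)

# x is the centred and scaled data multiplied by the rotation matrix
rotated <- scale(X, center = TRUE, scale = TRUE) %*% fit$rotation
same <- isTRUE(all.equal(fit$x, rotated, check.attributes = FALSE))
```

In the list returned here, rotation and x play the same roles, with contrast and penalty recording the selected hyperparameters alongside them.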

References

\insertAllCited

PhilBoileau/scPCA documentation built on Feb. 6, 2024, 3:33 p.m.