iClusterVB: Fast Integrative Clustering for High-Dimensional Multi-View...
In iClusterVB: Fast Integrative Clustering and Feature Selection for High Dimensional Data

View source: R/iClusterVB.R

iClusterVB

R Documentation

Fast Integrative Clustering for High-Dimensional Multi-View Data Using Variational Bayesian Inference

Description

iClusterVB offers a novel, fast, and integrative approach to clustering high-dimensional, mixed-type, and multi-view data. By employing variational Bayesian inference, iClusterVB facilitates effective feature selection and identification of disease subtypes, enhancing clinical decision-making.

Usage

iClusterVB(
  mydata,
  dist,
  K = 10,
  initial_method = "VarSelLCM",
  VS_method = 0,
  initial_cluster = NULL,
  initial_vs_prob = NULL,
  initial_fit = NULL,
  initial_omega = NULL,
  input_hyper_parameters = NULL,
  max_iter = 200,
  early_stop = 1,
  per = 10,
  convergence_threshold = 1e-04
)

Arguments

`mydata`	A list of length R, where R is the number of datasets, containing the input data. Note: For categorical data, `0`'s must be re-coded to another, non-`0` value.
`dist`	A vector of length R specifying the type of data or distribution. Options include: 'gaussian' (for continuous data), 'multinomial' (for binary or categorical data), and 'poisson' (for count data).
`K`	The maximum number of clusters, with a default value of 10. The algorithm will converge to a model with dominant clusters, removing redundant clusters and automating the determination of the number of clusters.
`initial_method`	The initialization method for cluster allocation. Options include: "VarSelLCM" (default), "random", "kproto" (k-prototypes), "kmeans" (continuous data only), "mclust" (continuous data only), or "lca" (poLCA, categorical data only).
`VS_method`	The variable/feature selection method. Options are 0 for clustering without variable/feature selection (default) and 1 for clustering with variable/feature selection.
`initial_cluster`	The initial cluster membership. The default is NULL, which uses initial_method for the initial cluster allocation. If not NULL, the supplied membership overrides the allocation that initial_method would produce.
`initial_vs_prob`	The initial variable/feature selection probability, a scalar. The default is NULL, which assigns a value of 0.5.
`initial_fit`	Initial values based on a previously fitted iClusterVB model (an iClusterVB object). The default is NULL.
`initial_omega`	Customized initial values for feature inclusion probabilities. The default is NULL. If not NULL, it will override the initial values setting for this parameter. If VS_method = 1, initial_omega is a list of length R, with each element being an array with dimensions dim = c(N, p[[r]]). Here, N is the sample size and p[[r]] is the number of features for dataset r, where r = 1, ..., R.
`input_hyper_parameters`	A list of the initial hyper-parameters of the prior distributions for the model. The default is NULL, which assigns alpha_00 = 0.001, mu_00 = 0, s2_00 = 100, a_00 = 1, b_00 = 1, kappa_00 = 1, u_00 = 1, v_00 = 1.
`max_iter`	The maximum number of iterations for the VB algorithm. The default is 200.
`early_stop`	Whether to stop the algorithm upon convergence or to continue until `max_iter` is reached. Options are 1 (default) to stop when the algorithm converges, and 0 to stop only when `max_iter` is reached.
`per`	Print information every "per" iterations. The default is 10.
`convergence_threshold`	The convergence threshold for the change in ELBO. The default is 0.0001.
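
The input requirements listed above can be sketched in base R before calling iClusterVB(). This is only an illustration on made-up data: the objects `cont_view`, `cat_view`, `N`, and `p` are hypothetical, shifting the codes by one is just one way to remove 0's, and passing the hyper-parameters as a named list is an assumption based on the names in the argument description.

```r
# Hypothetical toy data: N samples, one continuous and one categorical view.
set.seed(1)
N <- 6
cont_view <- matrix(rnorm(N * 3), nrow = N)                       # 3 continuous features
cat_view <- matrix(sample(0:2, N * 2, replace = TRUE), nrow = N)  # 2 features coded 0/1/2

# Categorical data must not contain 0, so shift every category code up by one.
cat_view <- cat_view + 1

# One list element per view, with a matching entry in `dist`
# (Gaussian views are listed first).
mydata <- list(cont_view, cat_view)
dist <- c("gaussian", "multinomial")

# Custom starting values for the feature inclusion probabilities
# (relevant when VS_method = 1): one N x p[[r]] array per view, all set to 0.5.
p <- lapply(mydata, ncol)
initial_omega <- lapply(p, function(p_r) array(0.5, dim = c(N, p_r)))

# Hyper-parameters written out explicitly, using the documented default values.
# Assumption: the function accepts a named list with these element names.
input_hyper_parameters <- list(
  alpha_00 = 0.001, mu_00 = 0, s2_00 = 100,
  a_00 = 1, b_00 = 1, kappa_00 = 1, u_00 = 1, v_00 = 1
)
```

These objects would then be supplied through the `mydata`, `dist`, `initial_omega`, and `input_hyper_parameters` arguments, together with `VS_method = 1`.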

Value

The iClusterVB function creates an object (list) of class iClusterVB. Relevant outputs include:

`elbo`:	The evidence lower bound for each iteration.
`cluster`:	The cluster assigned to each individual.
`initial_values`:	A list of the initial values.
`hyper_parameters`:	A list of the hyper-parameters.
`model_parameters`:	A list of the model parameters after the algorithm is run.

Of particular interest is rho, a list of the posterior inclusion probabilities for the features in each of the data views. This is the probability of including a certain predictor in the model, given the observations. This is only available if VS_method = 1.
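
As a rough illustration of how these outputs might be inspected, the sketch below works on a hand-built stand-in object rather than a real fit, so it runs without the package. Every number in `fit_mock` is invented. Storing `rho` inside `model_parameters` is an assumption suggested, but not confirmed, by the text above, and treating convergence as an absolute ELBO change below the threshold is likewise an assumption.

```r
# Hypothetical stand-in for a fitted object; every value here is invented.
fit_mock <- list(
  elbo = c(-1500, -1200, -1190, -1189.99995),
  cluster = c(1, 1, 3, 3, 3, 1),
  model_parameters = list(
    rho = list(
      c(0.98, 0.10, 0.75),  # view 1: three features
      c(0.40, 0.91)         # view 2: two features
    )
  )
)

# Change in ELBO between consecutive iterations, compared with the
# default convergence_threshold of 1e-04.
elbo_change <- abs(diff(fit_mock$elbo))
converged <- tail(elbo_change, 1) < 1e-04

# Number of clusters that actually received samples (can be smaller than K).
n_clusters <- length(unique(fit_mock$cluster))

# Features whose posterior inclusion probability exceeds 0.5 in each view.
# The 0.5 cutoff is an illustrative choice, not a package default.
selected <- lapply(fit_mock$model_parameters$rho, function(r) which(r > 0.5))
```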

Note

If any of the data views are "gaussian", list them first, both in the input data mydata and in the corresponding positions of the distribution vector dist. For example, use dist <- c("gaussian", "gaussian", "poisson", "multinomial"), and not dist <- c("poisson", "gaussian", "gaussian", "multinomial") or dist <- c("gaussian", "poisson", "gaussian", "multinomial").
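
One way to enforce this ordering is to reorder the views and the distribution vector together. The sketch below is a minimal base R example; the view names and toy matrices are hypothetical.

```r
# Hypothetical views supplied in an order that violates the rule above.
mydata <- list(
  counts = matrix(rpois(12, lambda = 3), nrow = 4),
  expr = matrix(rnorm(12), nrow = 4),
  mutation = matrix(sample(1:2, 12, replace = TRUE), nrow = 4),
  methyl = matrix(rnorm(12), nrow = 4)
)
dist <- c("poisson", "gaussian", "multinomial", "gaussian")

# order() on a logical vector puts FALSE first, so the Gaussian views move
# to the front; tied entries keep their original relative order.
ord <- order(dist != "gaussian")
mydata <- mydata[ord]
dist <- dist[ord]
```

Applying the same index `ord` to both objects keeps each view aligned with its distribution.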

Examples

# sim_data comes with the iClusterVB package.
dat1 <- list(
  gauss_1 = sim_data$continuous1_data[c(1:20, 61:80, 121:140, 181:200), 1:75],
  gauss_2 = sim_data$continuous2_data[c(1:20, 61:80, 121:140, 181:200), 1:75],
  poisson_1 = sim_data$count_data[c(1:20, 61:80, 121:140, 181:200), 1:75])

dist <- c(
  "gaussian", "gaussian",
  "poisson")

# Note: run time grows with `max_iter`, so fitting can be time-intensive.
# For the purpose of testing the code, use a small value (e.g. 10).
# For more accurate results, use a larger value (e.g. 200).

fit_iClusterVB <- iClusterVB(
  mydata = dat1,
  dist = dist,
  K = 4,
  initial_method = "VarSelLCM",
  VS_method = 1,
  max_iter = 10
)

# We can obtain a summary using the summary() function
summary(fit_iClusterVB)

iClusterVB documentation built on April 3, 2025, 6:22 p.m.
