| scregclust | R Documentation |
Use the scRegClust algorithm to determine gene modules and their regulatory programs from single-cell data.
scregclust(
  expression,
  genesymbols,
  is_regulator,
  penalization,
  n_modules,
  initial_target_modules = NULL,
  sample_assignment = NULL,
  center = TRUE,
  split1_proportion = 0.5,
  total_proportion = 1,
  split_indices = NULL,
  prior_indicator = NULL,
  prior_genesymbols = NULL,
  prior_baseline = 1e-06,
  prior_weight = 0.5,
  min_module_size = 0L,
  allocate_per_obs = TRUE,
  noise_threshold = 0.025,
  n_cycles = 50L,
  use_kmeanspp_init = TRUE,
  n_initializations = 50L,
  max_optim_iter = 10000L,
  tol_coop_rel = 1e-08,
  tol_coop_abs = 1e-12,
  tol_nnls = 1e-04,
  compute_predictive_r2 = TRUE,
  compute_silhouette = FALSE,
  nowarnings = FALSE,
  verbose = TRUE,
  quick_mode = FALSE,
  quick_mode_percent = 0.1
)
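A minimal call might look like the sketch below. The simulated data, the gene names, and the values chosen for penalization and n_modules are illustrative assumptions, not recommendations. The sketch also assumes that expression holds genes in rows and cells in columns, and that a 1 in is_regulator marks a candidate regulator. The fit is only attempted if the scregclust package is installed.

```r
set.seed(1)

n_cells <- 300
n_regulators <- 5
n_targets <- 40

# Hypothetical toy data: regulators are pure noise, and each target follows
# one of the first two regulators plus noise, so two modules are plausible.
regulators <- matrix(rnorm(n_regulators * n_cells), nrow = n_regulators)
driver <- rep(c(1, 2), length.out = n_targets)
targets <- regulators[driver, ] +
  matrix(rnorm(n_targets * n_cells, sd = 0.5), nrow = n_targets)

expression <- rbind(regulators, targets)  # genes in rows (assumption)
genesymbols <- c(
  paste0("REG", seq_len(n_regulators)),
  paste0("TGT", seq_len(n_targets))
)
# 1 = candidate regulator, 0 = target (assumed encoding)
is_regulator <- as.integer(startsWith(genesymbols, "REG"))

if (requireNamespace("scregclust", quietly = TRUE)) {
  fit <- scregclust::scregclust(
    expression,
    genesymbols,
    is_regulator,
    penalization = c(0.1, 0.2),
    n_modules = 2L,
    verbose = FALSE
  )
  print(class(fit))
}
```

All remaining arguments are left at the defaults shown in the usage above.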
expression |
|
genesymbols |
A vector of gene names corresponding to rows of
|
is_regulator |
An indicator vector where |
penalization |
Sparsity penalty related to the number of regulators associated with each module. Either a single positive number or a vector of positive numbers. |
n_modules |
Requested number of modules (integer).
If this is provided without specifying |
initial_target_modules |
The initial assignment of target genes to
modules of length |
sample_assignment |
A vector of sample assignments, one per cell. Can
be used to perform the data splitting with
stratification. Has to be of length |
center |
Whether or not genes should be centered within each subgroup
defined in |
split1_proportion |
The proportion to use for the first dataset during
data splitting. The proportion for the second
dataset is |
total_proportion |
Restricts the analysis to a proportion of the supplied
observations. The proportion of the first dataset
during data splitting in relation to the full
dataset will be
|
split_indices |
Can be used to provide an explicit data split. If this
is supplied then |
prior_indicator |
An indicator matrix (sparse or dense) of size |
prior_genesymbols |
A vector of gene names of length q corresponding
to the rows/columns in |
prior_baseline |
A positive baseline for the network prior. The larger this parameter is, the less impact the network prior will have. |
prior_weight |
A number between 0 and 1 indicating the strength of the prior in relation to the data. 0 ignores the prior and makes the algorithm completely data-driven. 1 uses only the prior during module allocation. |
min_module_size |
Minimum required number of target genes in a module. Smaller modules are emptied. |
allocate_per_obs |
Whether module allocation should be performed for
each observation in the second data split separately.
If |
noise_threshold |
Threshold for the best |
n_cycles |
Maximum number of algorithmic cycles. |
use_kmeanspp_init |
Use kmeans++ for module initialization if
|
n_initializations |
Number of kmeans(++) initialization runs. |
max_optim_iter |
Maximum number of iterations during optimization in the coop-Lasso and NNLS steps. |
tol_coop_rel |
Relative convergence tolerance during optimization in the coop-Lasso step. |
tol_coop_abs |
Absolute convergence tolerance during optimization in the coop-Lasso step. |
tol_nnls |
Convergence tolerance during optimization in the NNLS step. |
compute_predictive_r2 |
Whether to compute predictive |
compute_silhouette |
Whether to compute silhouette scores for each target gene. |
nowarnings |
If turned on, no warning messages are shown. |
verbose |
Whether to print progress. |
quick_mode |
Whether to use a reduced number of noise targets to speed up computations. |
quick_mode_percent |
A number in [0, 1) indicating the fraction of
noise targets to use in the re-allocation process
if |
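The prior arguments can be prepared in base R before the call. The sketch below builds an indicator matrix from a hypothetical edge list. It assumes that prior_indicator is a square q-by-q matrix whose rows and columns follow the order of prior_genesymbols, and that a symmetric 0/1 encoding is appropriate; the gene names and edges are made up for illustration.

```r
# Hypothetical prior knowledge: pairs of genes believed to be linked.
edges <- data.frame(
  from = c("GENE1", "GENE1", "GENE3"),
  to = c("GENE2", "GENE3", "GENE4")
)

prior_genesymbols <- sort(unique(c(edges$from, edges$to)))
q <- length(prior_genesymbols)

# Dense q-by-q indicator matrix, initially without any known links.
prior_indicator <- matrix(
  0L,
  nrow = q, ncol = q,
  dimnames = list(prior_genesymbols, prior_genesymbols)
)

# A two-column character matrix indexes by (row name, column name).
prior_indicator[cbind(edges$from, edges$to)] <- 1L
prior_indicator[cbind(edges$to, edges$from)] <- 1L  # symmetric (assumption)

print(prior_indicator)
```

Naming the rows and columns keeps the matrix aligned with prior_genesymbols even if the edge list is reordered.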
A list with S3 class scregclust containing
penalization |
The supplied |
results |
A list of result lists (each with S3 class
|
initial_target_modules |
Initial allocation of target genes into modules. |
split_indices |
Either the vector given as input, verbatim, or a vector encoding the splits as NA = not included, 1 = split 1, or 2 = split 2. Allows data splits to be reproduced. |
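The NA / 1 / 2 encoding of split_indices can be illustrated with a small base R sketch. It builds a stratified split by hand for a hypothetical sample_assignment vector. This only demonstrates the encoding and is not a description of how the package performs its own splitting.

```r
set.seed(42)

# Hypothetical sample labels for 10 cells from two samples.
sample_assignment <- rep(c("A", "B"), times = c(6, 4))
split1_proportion <- 0.5

# Start with every cell in split 2, then move a fraction of each sample
# into split 1 so that both splits contain cells from every sample.
split_indices <- rep(2L, length(sample_assignment))
for (s in unique(sample_assignment)) {
  idx <- which(sample_assignment == s)
  n1 <- round(split1_proportion * length(idx))
  split_indices[idx[sample.int(length(idx), n1)]] <- 1L
}

# Exclude one cell entirely to illustrate the NA encoding.
split_indices[10] <- NA

print(table(sample_assignment, split_indices, useNA = "ifany"))
```

A vector built this way could be stored alongside the results and passed back through the split_indices argument to reuse the same split.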
For each supplied penalization parameter, results contains a list with
the current penalization parameter,
the supplied genesymbols after filtering (as used during fitting),
the supplied is_regulator vector after filtering (as used during
fitting),
the number of fitted modules n_modules,
whether the current run converged to a single configuration (as a
boolean),
as well as an output object containing the numeric results for each
final configuration.
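The per-penalization structure described above can be summarized with a few vapply() calls. The object below is a hand-built stand-in for a fitted result so that the sketch runs without the package. The top-level names penalization and results, and the per-result names n_modules and output, follow this page; the remaining element names (penalization and converged inside each result) are assumptions.

```r
# Hypothetical stand-in for the return value of scregclust().
fit <- list(
  penalization = c(0.1, 0.2),
  results = list(
    list(penalization = 0.1, n_modules = 3L, converged = TRUE,
         output = list(list())),
    list(penalization = 0.2, n_modules = 3L, converged = FALSE,
         output = list(list(), list()))
  )
)

# One row per penalization parameter; n_configurations counts the final
# configurations stored in `output` for that run.
summary_table <- data.frame(
  penalization = vapply(fit$results, function(r) r$penalization, numeric(1)),
  n_modules = vapply(fit$results, function(r) r$n_modules, integer(1)),
  converged = vapply(fit$results, function(r) r$converged, logical(1)),
  n_configurations = vapply(fit$results, function(r) length(r$output), integer(1))
)

print(summary_table)
```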
It is possible that the algorithm ends in a finite cycle of configurations
instead of a unique final configuration.
Therefore, output is a list with each element itself being a list
with the following contents:
reg_table |
A regulator table: a matrix of weights for each regulator and module. |
module |
Vector of the same length as genesymbols containing the module assignments for all genes, with regulators marked as NA. Genes considered noise are marked as -1. |
module_all |
Same as module, except that genes marked as noise (-1 in module) are assigned to the module in which they have the largest R^2, even if it is below noise_threshold. |
r2 |
Matrix of predictive R^2 values for each target gene and module. |
best_r2 |
Vector of the best predictive R^2 for each gene (regulators marked with NA). |
best_r2_idx |
Module index corresponding to the best predictive R^2 for each gene (regulators marked with NA). |
r2_module |
A vector of predictive R^2 values for each module (included if compute_predictive_r2 == TRUE). |
importance |
A matrix of importance values for each regulator (rows) and module (columns) (included if compute_predictive_r2 == TRUE). |
r2_cross_module_per_target |
A matrix of cross-module R^2 values for each target gene (rows) and each module (columns) (included if compute_silhouette == TRUE). |
silhouette |
A vector of silhouette scores for each target gene (included if compute_silhouette == TRUE). |
models |
Regulator selection for each module as a matrix with regulators in rows and modules in columns. |
signs |
Regulator signs for each module as a matrix with regulators in rows and modules in columns. |
weights |
Average regulator coefficient for each module. |
coeffs |
List of regulator coefficient matrices for each module for all target genes, as re-estimated in the NNLS step. |
sigmas |
Matrix of residual variances, one per target gene in each module; derived from the residuals in the NNLS step. |
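The conventions above (NA for regulators, -1 for noise, regulators in rows and modules in columns) can be used to summarize a configuration. The sketch below works on a hand-built stand-in for one element of output. The values, the row names on models, and its logical storage type are assumptions for illustration.

```r
# Hypothetical stand-in for one final configuration in `output`.
out <- list(
  module = c(NA, NA, 1, 1, 2, -1, 2, 1),
  models = matrix(
    c(TRUE, FALSE,
      TRUE, TRUE),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("REG1", "REG2"), NULL)
  )
)

# Regulators are NA in `module`; noise genes are marked as -1.
is_target <- !is.na(out$module)
is_noise <- is_target & out$module == -1

# Number of target genes per module, excluding regulators and noise.
module_sizes <- table(out$module[is_target & !is_noise])

# Names of the regulators selected for each module (columns of `models`).
# Comparing against 0 works for logical and for 0/1 numeric matrices.
regulators_per_module <- lapply(
  seq_len(ncol(out$models)),
  function(m) rownames(out$models)[out$models[, m] != 0]
)

print(module_sizes)
print(regulators_per_module)
```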