multisnpnet | R Documentation |
Fit a sparse reduced-rank regression model on large-scale SNP data and multivariate responses, using batch variable screening and alternating minimization. It computes a full solution path on a grid of penalty values. It can handle larger-than-memory SNP data, missing values, and adjustment covariates.
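The grid of penalty values is controlled by the nlambda and lambda.min.ratio arguments. The following is a minimal sketch of how such a grid could be constructed, assuming log-spaced values as in the glmnet convention; the spacing actually used by multisnpnet is not stated here, and lambda_max is a hypothetical value that would normally be derived from the data.

```r
# Hypothetical maximum penalty; in practice it would be computed from the data.
lambda_max <- 2.5
nlambda <- 100            # number of penalty values (default in the usage)
lambda.min.ratio <- 0.01  # ratio of the smallest to the largest penalty

# Assumed log-spaced grid, decreasing from lambda_max
# to lambda_max * lambda.min.ratio.
lambda_grid <- exp(seq(log(lambda_max),
                       log(lambda_max * lambda.min.ratio),
                       length.out = nlambda))

# Smallest and largest penalty on the grid.
range(lambda_grid)
```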
multisnpnet(genotype_file, phenotype_file, phenotype_names, binary_phenotypes = NULL,
covariate_names, rank, nlambda = 100, lambda.min.ratio = 0.01, standardize_response = TRUE,
weight = NULL, validation = FALSE, split_col = NULL, mem = NULL,
batch_size = 100, prev_iter = 0, max.iter = 10, configs = NULL, save = TRUE,
early_stopping = FALSE)
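As an illustration of the usage above, the sketch below assembles an argument list and calls the function only when it is available and the input files exist. All paths, phenotype names and covariate names are hypothetical placeholders, not values shipped with the package.

```r
# Hypothetical inputs: replace the paths and column names with your own.
args <- list(
  genotype_file = "data/genotypes",       # prefix of the .pgen/.psam/.pvar.zst files
  phenotype_file = "data/phenotypes.phe", # must contain FID, IID and the columns below
  phenotype_names = c("height", "bmi"),
  covariate_names = c("age", "sex"),
  rank = 2,                               # target rank of the model
  nlambda = 50,
  validation = TRUE,
  split_col = "split",                    # column holding "train" / "val"
  early_stopping = TRUE
)

# Only attempt the fit when the function is loaded and the inputs exist.
can_run <- exists("multisnpnet", mode = "function") &&
  file.exists(paste0(args$genotype_file, ".pgen")) &&
  file.exists(args$phenotype_file)

if (can_run) {
  fit <- do.call("multisnpnet", args)
}
```

Arguments left out of the list, such as lambda.min.ratio and batch_size, keep the defaults shown in the usage.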
genotype_file |
Path to the suite of genotype files, given without a file extension. The files genotype_file.pgen, genotype_file.psam and genotype_file.pvar.zst must exist. |
phenotype_file |
Path to the phenotype file. The header must include FID, IID, covariate_names and phenotype_names. Missing values are expected to be encoded as -9. |
binary_phenotypes |
Names of the binary phenotypes. AUC will be evaluated for binary phenotypes. |
covariate_names |
Character vector of the names of the adjustment covariates. |
rank |
Target rank of the model. |
nlambda |
Number of penalty values. |
lambda.min.ratio |
Ratio of the minimum penalty to the maximum penalty. |
standardize_response |
Boolean. Whether to standardize the responses before fitting, to account for potentially different units across the responses. |
weight |
Numeric vector that specifies the (importance) weights for the responses. |
p.factor |
Named vector of separate penalty factors applied to each coefficient. This is a number that multiplies |
validation |
Boolean. Whether to evaluate on validation set. |
split_col |
Name of the column in the phenotype file that specifies whether each sample belongs to the training split or the validation split. The values are either "train" or "val". |
mem |
Memory available for the program. It tells PLINK 2.0 the amount of memory it can harness for the computation. IMPORTANT if using a job scheduler. |
batch_size |
Number of variants used in batch screening. |
prev_iter |
Index of the iteration to start from (e.g. to resume a previously interrupted computation). |
max.iter |
Maximum number of iterations allowed for alternating minimization. |
configs |
List of additional configuration parameters. It can include:
|
save |
Boolean. Whether to save intermediate results. |
early_stopping |
Boolean. Whether to stop the process early if the validation metric starts to fall. |
early_stopping_phenotypes |
List of phenotypes to focus on when evaluating the early stopping condition. |
early_stopping_check_average |
Whether to check the average metric when evaluating the early stopping condition. |
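The requirements on the phenotype file described above (FID and IID columns, covariates, phenotypes, -9 for missing values, and an optional split column with values "train" or "val") can be sketched in base R as follows. The column names, the values and the tab-separated layout are assumptions made for illustration only.

```r
# Hypothetical covariate and phenotype names.
covariate_names <- c("age", "sex")
phenotype_names <- c("height", "bmi")

pheno <- data.frame(
  FID = sprintf("F%02d", 1:8),
  IID = sprintf("I%02d", 1:8),
  age = c(34, 51, 47, 29, 62, 45, 38, 55),
  sex = c(1, 2, 2, 1, 1, 2, 1, 2),
  height = c(170.2, 165.0, NA, 180.5, 158.3, 172.1, 169.9, 177.4),
  bmi = c(22.1, NA, 24.8, 27.3, 19.9, 23.5, 21.0, 25.6),
  split = c(rep("train", 6), rep("val", 2)),  # values must be "train" or "val"
  stringsAsFactors = FALSE
)

# Encode missing phenotype values as -9 before writing.
out <- pheno
for (col in phenotype_names) {
  out[[col]][is.na(out[[col]])] <- -9
}

# Write a tab-separated file (assumed layout) to a temporary location.
path <- tempfile(fileext = ".phe")
write.table(out, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back and count the entries encoded as missing.
raw <- read.delim(path, stringsAsFactors = FALSE)
n_missing <- sum(raw[phenotype_names] == -9)
```

The resulting path could then be passed as phenotype_file, with split_col = "split" and validation = TRUE.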