anova_glmnet_single: Assign ANOVA type-II -log10 p-values to one submodel of a...

anova_glmnet_single R Documentation

Assign ANOVA type-II -log10 p-values to one submodel of a sparse glmnet model

Description

Given a sparse glmnet model (not ridge regression), this function assigns ANOVA type-II -log10 p-values to a single submodel. The submodel is specified either by providing the cross-validation glmnet object and setting index to "min" or "1se" (see the return object of glmnet::cv.glmnet()), or by providing only the beta matrix (sparse coefficients) and the numeric index of the lambda penalty factor. To achieve this, the selected set of loci is fit again to the original data without penalization.
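The refit step described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, the matrix orientation (individuals in rows, loci in columns) is assumed for the sketch only, the non-zero coefficients are made up, and drop1() with an F test stands in for whatever anova2() does internally.

```r
# Sketch only: base R stand-in for the refit step, not polygenr's code.
set.seed(1)
n <- 100   # individuals
m <- 20    # loci
# assumed orientation for this sketch: individuals in rows, loci in columns
X <- matrix(rbinom(n * m, 2, 0.3), nrow = n, ncol = m)
colnames(X) <- paste0("snp", seq_len(m))
y <- 1.0 * X[, 3] - 0.8 * X[, 7] + rnorm(n)

# pretend one column of a sparse beta matrix has these non-zero entries
# (values are made up for illustration)
beta_col <- numeric(m)
beta_col[c(3, 7, 12)] <- c(0.9, -0.7, 0.05)
selected <- which(beta_col != 0)

# refit only the selected loci, without penalization
dat <- data.frame(y = y, X[, selected, drop = FALSE])
fit <- lm(y ~ ., data = dat)

# drop each locus in turn from the full model (type-II style F tests)
tab <- drop1(fit, test = "F")
pvals <- tab[colnames(X)[selected], "Pr(>F)"]

# scores are -log10 p-values; unselected loci keep a score of zero
scores <- numeric(m)
scores[selected] <- -log10(pvals)
```

In a model with main effects only, testing each term against the full model is what a type-II analysis amounts to, which is why drop1() is a reasonable stand-in here.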

Usage

anova_glmnet_single(
  X,
  y,
  pcs = NULL,
  obj_cv = NULL,
  beta = NULL,
  index = "min",
  ret_sparse = FALSE
)

Arguments

X

The genotype matrix. Same as was used in glmnet_pca().

y

The trait vector. Same as was used in glmnet_pca().

pcs

The PC (eigenvector) matrix (optional). Same as was used in glmnet_pca(). Unlike genotypes, PCs are not given p-values.

obj_cv

Optional, the cross-validation object produced by glmnet_pca() with cv = TRUE (must be class "cv.glmnet"). Either this or beta must be provided.

beta

Optional, the sparse matrix of coefficients (component $beta) of the glmnet object (output of glmnet_pca() with cv = FALSE, or component $glmnet.fit$beta if cv = TRUE). Either this or obj_cv must be provided.

index

The index for the desired lambda penalty factor, which implicitly chooses the submodel to analyze. This can be a numeric index corresponding to a column of beta. Alternatively, if obj_cv is provided, index can be "min" (default, selects obj_cv$index[1] as the actual index, which corresponds to the lambda that minimized the mean cross-validation error) or "1se" (obj_cv$index[2], which is the largest lambda such that error is within 1 standard error of the minimum). See glmnet::cv.glmnet() for more information on "min" and "1se" definitions.
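The way index is resolved can be sketched as follows. Both resolve_index() and obj_cv_mock are hypothetical names introduced for illustration; the mock object mimics only the $index component described above, with made-up values, and is not a real cross-validation object.

```r
# Hypothetical helper, for illustration only (not part of polygenr)
resolve_index <- function(index, obj_cv = NULL) {
  # a numeric index is used directly as a column of beta
  if (is.numeric(index)) return(as.integer(index))
  if (is.null(obj_cv))
    stop("`obj_cv` is required when `index` is \"min\" or \"1se\"")
  switch(index,
    "min" = obj_cv$index[1],
    "1se" = obj_cv$index[2],
    stop("`index` must be numeric, \"min\", or \"1se\"")
  )
}

# stand-in for a cross-validation object; only `$index` is mimicked,
# and the values 37 and 21 are made up
obj_cv_mock <- list(index = c(min = 37L, `1se` = 21L))

i_min <- resolve_index("min", obj_cv_mock)   # first entry of $index
i_1se <- resolve_index("1se", obj_cv_mock)   # second entry of $index
i_num <- resolve_index(50)                   # numeric index passed through
```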

ret_sparse

Logical that controls the form of the return value (see the Value section).

Value

If ret_sparse = FALSE (default), returns a complete vector of scores (-log10 p-values) for every locus in X, with zeroes for all loci with zero coefficients. For loci with non-zero coefficients, p-values are calculated using anova2(); see its documentation for more details. If ret_sparse = TRUE, returns a list of indexes and scores corresponding only to the loci with non-zero coefficients.
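The relationship between the two return forms can be illustrated with a small sketch. The list component names indexes and scores are assumed here (the actual names in the returned list may differ), and the numbers are made up.

```r
m <- 10   # total number of loci in X (made-up size)

# hypothetical sparse return: component names and values are assumed
sparse <- list(indexes = c(2L, 5L, 9L), scores = c(3.2, 0.4, 7.1))

# expand to the dense form: zero for loci with zero coefficients
dense <- numeric(m)
dense[sparse$indexes] <- sparse$scores

# collapse back to the sparse form
back <- list(indexes = which(dense != 0), scores = dense[dense != 0])
```

Note that collapsing a dense vector this way would lose any selected locus whose score happens to be exactly zero, which is one reason a sparse return that carries explicit indexes can be preferable.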

See Also

glmnet_pca(), particularly option cv = TRUE, for obtaining cross-validation objects with PCs.

anova_single() for scoring a model specified by locus indexes only.

anova_glmnet() for a version that calculates scores for all models (all lambdas), though it is much slower and not generally recommended.

anova2() for additional details and data restrictions.

scores_glmnet() for a different way of scoring/ranking variants.

Examples

## Not run: 
# version with cross-validation object `obj_cv` (recommended)
# defaults to selecting model with lowest cross-validation error (`index = "min"`)
scores <- anova_glmnet_single( X, y, pcs, obj_cv = obj_cv )

# version with beta matrix and desired index
scores <- anova_glmnet_single( X, y, pcs, beta = beta, index = 50 )

## End(Not run)


OchoaLab/polygenr documentation built on March 18, 2022, 10:52 a.m.