# Introduction to the ggmix package

In ggmix: Variable Selection in Linear Mixed Models for SNP Data

## Fit the Linear Mixed Model with Lasso Penalty

We start with the most basic call to ggmix, the main function of this package. By default, it fits an $L_1$-penalized linear mixed model (LMM) for 100 distinct values of the tuning parameter $\lambda$, choosing the sequence of $\lambda$ values itself:

fit <- ggmix(x = admixed$xtrain, y = admixed$ytrain,
             kinship = admixed$kin_train)
names(fit)
class(fit)

We can see the solution path for each variable by calling the plot method for objects of class ggmix_fit:

plot(fit)

We can also get the coefficients for given value(s) of $\lambda$ using the coef method for objects of class ggmix_fit:

coef(fit, s = c(0.1, 0.02))

We can also get predictions ($X\widehat{\boldsymbol{\beta}}$) using the predict method for objects of class ggmix_fit:

# need to provide x to the predict function
head(predict(fit, s = 0.01, newx = admixed$xtest))

## Find the Optimal Value of the Tuning Parameter

We use the Generalized Information Criterion (GIC) to select the optimal value for $\lambda$. The GIC takes the form

$$GIC_{\lambda} = -2 \ell(\widehat{\boldsymbol{\beta}}, \widehat{\sigma}^2, \widehat{\eta}) + a_n \cdot \widehat{df}_{\lambda}$$

where $\ell(\cdot)$ is the log-likelihood evaluated at the converged values of the parameters, $\widehat{df}_{\lambda}$ is the number of non-zero elements in $\widehat{\boldsymbol{\beta}}_{\lambda}$ plus two (representing the variance parameters $\eta$ and $\sigma^2$), and $a_n$ is a non-negative penalty parameter. The BIC has $a_n = \log(n)$, and the AIC has $a_n = 2$. The user can specify any value of $a_n$. The default is $a_n = \log(\log(n)) \cdot \log(p)$, which corresponds to a high-dimensional BIC (HDBIC):
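To make the formula concrete, the following base R sketch evaluates the three penalties named above and plugs them into the GIC at a single value of $\lambda$. It does not call ggmix; the sample size, number of predictors, log-likelihood, and number of non-zero coefficients are made-up values, and `gic_value` is a hypothetical helper written only for this illustration.

```r
# Worked example of the GIC penalty a_n and the resulting criterion.
# All inputs (n, p, loglik, nonzero) are made-up values for illustration.
n <- 200    # number of observations (hypothetical)
p <- 1000   # number of predictors (hypothetical)

# the three penalties discussed in the text
a_n <- c(
  AIC   = 2,
  BIC   = log(n),
  HDBIC = log(log(n)) * log(p)
)

# GIC = -2 * loglik + a_n * df, with df = (number of non-zero coefficients) + 2
gic_value <- function(loglik, nonzero, a_n) {
  df <- nonzero + 2   # the two variance parameters, eta and sigma^2
  -2 * loglik + a_n * df
}

# hypothetical converged log-likelihood and model size at one lambda
gic_at_lambda <- gic_value(loglik = -310.5, nonzero = 8, a_n = a_n)

print(round(a_n, 3))
print(round(gic_at_lambda, 2))
```

With these inputs the HDBIC penalty is the largest of the three, so it charges the most for each additional non-zero coefficient and tends to favour sparser models.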

# pass the fitted object from ggmix to the gic function:
hdbic <- gic(fit)
class(hdbic)

# the BIC can be used instead by setting the an argument of gic to log(n)
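A minimal sketch of such a call is shown here. It assumes that the `admixed` data set ships with the package and can be loaded with `data()`, and that `an` accepts a plain numeric value; the names `n_train` and `bicfit` are chosen for this illustration only.

```r
library(ggmix)
data("admixed")

# refit the default model so that this block runs on its own
fit <- ggmix(x = admixed$xtrain, y = admixed$ytrain,
             kinship = admixed$kin_train)

# BIC penalty: a_n = log(n), with n the number of training observations
n_train <- length(admixed$ytrain)
bicfit <- gic(fit, an = log(n_train))
class(bicfit)
```

The resulting object can then be used in place of `hdbic` wherever a ggmix_gic object is expected.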

## Diagnostic Plots

We can also produce standard diagnostic plots, such as the observed vs. predicted response, QQ-plots of the residuals and random effects, and the Tukey-Anscombe plot. These are drawn with the plot method on a ggmix_gic object, as shown below.

### Observed vs. Predicted Response

plot(hdbic, type = "predicted", newx = admixed$xtrain, newy = admixed$ytrain)

### QQ-plots for Residuals and Random Effects

plot(hdbic, type = "QQranef", newx = admixed$xtrain, newy = admixed$ytrain)
plot(hdbic, type = "QQresid", newx = admixed$xtrain, newy = admixed$ytrain)

### Tukey-Anscombe Plot

plot(hdbic, type = "Tukey", newx = admixed$xtrain, newy = admixed$ytrain)
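For readers unfamiliar with the term, a Tukey-Anscombe plot shows residuals against fitted values; a well-specified model should give a patternless band around zero. The base R sketch below builds one by hand for an ordinary linear model on simulated data. It is a stand-in meant only to show what the plot contains, not a reproduction of the ggmix output, and all variable names are illustrative.

```r
# Tukey-Anscombe plot (residuals vs. fitted values) for an ordinary linear model.
# Simulated data and lm() are stand-ins for illustration; this is not ggmix output.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

ols <- lm(y ~ x)
fitted_vals <- fitted(ols)
resid_vals <- resid(ols)

plot(fitted_vals, resid_vals,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Tukey-Anscombe plot")
abline(h = 0, lty = 2)  # reference line at zero
```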

ggmix documentation built on April 13, 2021, 9:06 a.m.