LEGIT | R Documentation |

Constructs a generalized linear model (glm) with a weighted latent environmental score and weighted latent genetic score using alternating optimization.
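
To make the idea of alternating optimization concrete, here is a minimal base-R sketch for a Gaussian two-way model `y ~ G*E` on simulated data. It is an illustration under stated assumptions, not the package's implementation: the simulated data, the variable names, the equal starting weights, the normalization of the weights to sum to 1 in absolute value, and the stopping rule are all choices made for this example.

```r
# Illustrative sketch only: alternating least squares for y ~ G*E,
# where G = genes %*% p and E = env %*% q are weighted latent scores.
set.seed(777)
n <- 500
genes <- matrix(rbinom(n * 3, 1, 0.3), n, 3)  # hypothetical binary genetic variants
env   <- matrix(rnorm(n * 2), n, 2)           # hypothetical environmental measures
p_true <- c(0.5, 0.3, 0.2)
q_true <- c(0.6, 0.4)
G <- drop(genes %*% p_true)
E <- drop(env %*% q_true)
y <- 1 + 2 * G + 1 * E + 3 * G * E + rnorm(n, sd = 0.5)

# Equal starting weights (an assumption of this sketch)
p <- rep(1 / ncol(genes), ncol(genes))
q <- rep(1 / ncol(env), ncol(env))
eps <- 0.001
maxiter <- 100

for (iter in seq_len(maxiter)) {
  p_old <- p
  q_old <- q

  # Step 1: hold the weights fixed and fit the main model.
  G <- drop(genes %*% p)
  E <- drop(env %*% q)
  b <- unname(coef(lm(y ~ G * E)))  # order: intercept, G, E, G:E

  # Step 2: hold the main model and E fixed, update the genetic weights.
  # y - b0 - bE*E = (bG + bGE*E) * (genes %*% p) + error
  Xg <- genes * (b[2] + b[4] * E)
  rg <- y - b[1] - b[3] * E
  p <- unname(coef(lm(rg ~ Xg - 1)))
  p <- p / sum(abs(p))

  # Step 3: refit the main model, then update the environmental weights.
  # y - b0 - bG*G = (bE + bGE*G) * (env %*% q) + error
  G <- drop(genes %*% p)
  b <- unname(coef(lm(y ~ G * E)))
  Xe <- env * (b[3] + b[4] * G)
  re <- y - b[1] - b[2] * G
  q <- unname(coef(lm(re ~ Xe - 1)))
  q <- q / sum(abs(q))

  # Stop when no weight moved by more than eps.
  if (max(abs(c(p - p_old, q - q_old))) < eps) break
}

round(p, 2)  # estimated genetic weights
round(q, 2)  # estimated environmental weights
```

Each step is an ordinary least-squares fit, which is why the scheme needs only standard model-fitting routines. The `eps` and `maxiter` values mirror the defaults listed in the usage below.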

```
LEGIT(
data,
genes,
env,
formula,
start_genes = NULL,
start_env = NULL,
eps = 0.001,
maxiter = 100,
family = gaussian,
ylim = NULL,
print = TRUE,
print_steps = FALSE,
crossover = NULL,
crossover_fixed = FALSE,
reverse_code = FALSE,
rescale = FALSE,
lme4 = FALSE
)
```

`data` |
data.frame of the dataset to be used. |

`genes` |
data.frame of the variables inside the genetic score |

`env` |
data.frame of the variables inside the environmental score |

`formula` |
Model formula. Use G for the genetic score and E for the environmental score (ex: y ~ G*E). |

`start_genes` |
Optional starting points for genetic score (must be the same length as the number of columns of `genes`). |

`start_env` |
Optional starting points for environmental score (must be the same length as the number of columns of `env`). |
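
As a small illustration of the length requirement on the starting points, the following base-R sketch builds equal-weight starting vectors and checks them against hypothetical `genes` and `env` data frames before they would be passed to `LEGIT`. The data and the equal-weight choice are assumptions made for this example.

```r
# Hypothetical inputs: 3 genetic variables and 2 environmental variables
genes <- data.frame(g1 = c(0, 1, 0, 1), g2 = c(1, 1, 0, 0), g3 = c(0, 0, 1, 1))
env   <- data.frame(e1 = c(0.2, -1.0, 0.5, 1.3), e2 = c(3, 1, 2, 4))

# One starting value per column of each data.frame
start_genes <- rep(1 / ncol(genes), ncol(genes))
start_env   <- rep(1 / ncol(env), ncol(env))

# Fail early if the lengths do not match the number of columns
stopifnot(length(start_genes) == ncol(genes),
          length(start_env) == ncol(env))
```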

`eps` |
Threshold for convergence (.01 for quick batch simulations, .0001 for accurate results). |

`maxiter` |
Maximum number of iterations. |

`family` |
Outcome distribution and link function (Default = gaussian). |

`ylim` |
Optional vector containing the known min and max of the outcome variable. Even if your outcome is known to be in [a,b], if you assume a Gaussian distribution, predict() could return values outside this range. This parameter ensures that this never happens. This is not necessary with a distribution that already assumes the proper range (ex: [0,1] with binomial distribution). |
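
The effect described for `ylim` can be pictured as clamping predictions to a known range. The base-R sketch below shows that idea on made-up numbers; it is an assumption about the general technique, not a copy of what the package does internally.

```r
# Known range of the outcome (hypothetical)
ylim <- c(0, 10)

# Predictions from a Gaussian model can fall outside that range
pred <- c(-1.2, 3.4, 11.7)

# Clamp each prediction to [ylim[1], ylim[2]]
pred_clamped <- pmin(pmax(pred, ylim[1]), ylim[2])
pred_clamped  # 0.0 3.4 10.0
```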

`print` |
If FALSE, nothing except warnings will be printed (Default = TRUE). |

`print_steps` |
If TRUE, print the parameters at all iterations, good for debugging (Default = FALSE). |

`crossover` |
If not NULL, estimates the crossover point of E, using the provided value as the starting point. |

`crossover_fixed` |
If TRUE, the crossover point of E is fixed to the value of `crossover` instead of being estimated. Used when creating a diathesis-stress model (Default = FALSE). |

`reverse_code` |
If TRUE, after fitting the model, the genes with negative weights are reverse coded (ex: |

`rescale` |
If TRUE, the environmental variables are automatically rescaled to the range [-1,1]. This improves interpretability (Default=FALSE). |
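
For readers who want to see what rescaling to [-1,1] looks like, here is a minimal base-R sketch using a min-max transformation on hypothetical environmental variables. The helper name `rescale_pm1` and the data are invented for this example, and the package's own rescaling may differ in detail.

```r
# Hypothetical helper: linearly map a numeric vector onto [-1, 1]
rescale_pm1 <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  2 * (x - lo) / (hi - lo) - 1
}

env <- data.frame(e1 = c(0, 5, 10), e2 = c(2, 3, 6))
env_rescaled <- as.data.frame(lapply(env, rescale_pm1))

env_rescaled$e1  # -1  0  1
env_rescaled$e2  # -1.0 -0.5  1.0
```

With every environmental variable on the same scale, the weights of the environmental score become directly comparable.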

`lme4` |
If TRUE, uses lme4::lmer or lme4::glmer. Note that this is an experimental feature; bugs may arise and certain functions may fail. Currently only summary(), plot(), GxE_interaction_test(), LEGIT() and LEGIT_cv() work. Also note that the AIC and certain elements ignore the existence of the genes and environment variables, so the AIC should not be used for variable selection of the genes and the environment. However, the AIC can still be used to compare models with the same genes and environments (Default=FALSE). |

Returns an object of the class "LEGIT", which is a list containing, in the following order: a glm fit of the main model, a glm fit of the genetic score, a glm fit of the environmental score, a list of the true model parameters (AIC, BIC, rank, df.residual, null.deviance) that the individual model parts (main, genetic, environmental) do not estimate properly, and the formula.

Alexia Jolicoeur-Martineau, Ashley Wazana, Eszter Szekely, Meir Steiner, Alison S. Fleming, James L. Kennedy, Michael J. Meaney, Celia M.T. Greenwood and the MAVAN team. *Alternating optimization for GxE modelling with weighted genetic and environmental scores: examples from the MAVAN study* (2017). arXiv:1703.08111.

```
# Simulated data for a two-way (G x E) model
train = example_2way(500, 1, seed=777)
# Fit using the coefficients stored with the simulated data as starting points
fit_best = LEGIT(train$data, train$G, train$E, y ~ G*E, train$coef_G, train$coef_E)
# Fit using the default starting points
fit_default = LEGIT(train$data, train$G, train$E, y ~ G*E)
summary(fit_default)
summary(fit_best)
# Simulated data for a three-way (G x E x z) model
train = example_3way(500, 2.5, seed=777)
fit_best = LEGIT(train$data, train$G, train$E, y ~ G*E*z, train$coef_G, train$coef_E)
fit_default = LEGIT(train$data, train$G, train$E, y ~ G*E*z)
summary(fit_default)
summary(fit_best)
```
