In sahirbhatnagar/penfam: Variable Selection in Linear Mixed Models for SNP Data

```r
library(knitr)
code <- paste0("/mnt/GREENWOOD_BACKUP/home/sahir.bhatnagar/ggmix/simulation/",
               c("model_functions.R", "method_functions.R", "eval_functions.R", "main.R"))
code_lastmodified <- max(file.info(code)$mtime)
sapply(code, read_chunk)
# knitr:::knit_code$get()
```

This is a knitr report generated by the simulator to describe your simulation. Knitting this file will rerun the simulation if any of the code files have been modified since the simulation object was last created.
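The staleness check described above can be illustrated with base R file timestamps. This is only a sketch of the idea, not the simulator package's actual mechanism: `needs_rerun()`, `code_file` and `sim_file` are hypothetical names, and temporary files stand in for the real code and simulation object.

```r
# Hypothetical helper: TRUE when any code file is newer than the saved
# simulation object, or when no saved object exists yet.
needs_rerun <- function(code_files, sim_file) {
  if (!file.exists(sim_file)) return(TRUE)
  code_lastmodified <- max(file.info(code_files)$mtime)
  code_lastmodified > file.info(sim_file)$mtime
}

# Temporary files stand in for the real code file and simulation object.
code_file <- tempfile(fileext = ".R")
sim_file  <- tempfile(fileext = ".rds")
writeLines("x <- 1", code_file)
saveRDS(list(), sim_file)

# Force known timestamps so the comparison does not depend on timing.
now <- Sys.time()
Sys.setFileTime(code_file, now - 60)
Sys.setFileTime(sim_file, now)
stale_before <- needs_rerun(code_file, sim_file)  # code older than the object

Sys.setFileTime(code_file, now + 60)
stale_after <- needs_rerun(code_file, sim_file)   # code edited afterwards
```

In this toy setup, `stale_before` is `FALSE` and `stale_after` is `TRUE`, mirroring the rule that only a code change after the last run triggers a rerun.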

```{css, echo = FALSE}
pre code, pre, code {
  white-space: pre !important;
  overflow-x: scroll !important;
  word-break: keep-all !important;
  word-wrap: initial !important;
}
```

# Main simulation

## Simulation Details

To assess the performance of penfam, we used genotype data from the UK Biobank cohort so that the simulated data retain a realistic linkage disequilibrium (LD) structure. We restricted the simulation study to first-degree relatives, as defined by the KING estimates of the kinship coefficients, which resulted in $n=1069$ individuals. Not all of these individuals are related to each other; rather, each individual is related to at least one other person in the sample.

We define the following quantities:

1) $c$: proportion of causal SNPs  
2) $\mathbf{X}^{(test)}$: $n \times 1000$ matrix of SNPs randomly sampled across the genome, with sampling weights proportional to the size of each chromosome. These SNPs are included as fixed effects in our model.  
3) $\mathbf{X}^{(causal)}$: $n \times (c \cdot 1000)$ matrix containing the subset of the fixed-effect SNPs that are truly associated with the simulated phenotype, where $\mathbf{X}^{(causal)} \subseteq \mathbf{X}^{(test)}$  
4) $\mathbf{X}^{(other)}$: $n \times 8000$ matrix of SNPs randomly sampled across the genome, with sampling weights proportional to the size of each chromosome. SNPs from this matrix, in conjunction with some of the SNPs in $\mathbf{X}^{(test)}$, are used to construct the kinship matrix ($\boldsymbol{\Phi}$). We vary the balance between these two sources, and with it the proportion of causal SNPs used to calculate kinship.  
5) $\mathbf{X}^{(kinship)}$: $n \times k$ matrix of SNPs used to construct the kinship matrix.  
6) $\beta_j$: effect size for the $j^{th}$ SNP, simulated from a $\text{Uniform}(0.9, 1.1)$ distribution for $j = 1, \ldots, (c \cdot 1000)$  
7) The response is simulated as follows:

\[\mathbf{Y} | (\boldsymbol{\beta}, \eta, \sigma^2) \sim \mathcal{N}(\mu, \eta \sigma^2 \boldsymbol{\Phi} + (1-\eta)\sigma^2 \boldsymbol{I})\]
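The quantities and sampling model above can be sketched in base R on toy data. Several pieces here are assumptions rather than the report's actual code: genotypes are drawn independently from a binomial distribution instead of being taken from UK Biobank, the kinship matrix is estimated from standardized genotypes, the mean is taken to be $\mu = \mathbf{X}^{(causal)}\boldsymbol{\beta}$, and the dimensions are shrunk so the example runs quickly. The helper names `sim_geno()` and `kinship()` are hypothetical.

```r
set.seed(1)

n       <- 50     # toy sample size (the study uses n = 1069)
p_test  <- 100    # toy number of test SNPs (the study uses 1000)
p_other <- 400    # toy number of other SNPs (the study uses 8000)
c_prop  <- 0.05   # toy proportion of causal SNPs
eta     <- 0.5
sigma2  <- 4

# Toy genotypes coded 0/1/2 with one common allele frequency (assumption).
sim_geno <- function(n, p, maf = 0.3) {
  matrix(rbinom(n * p, size = 2, prob = maf), nrow = n, ncol = p)
}
X_test  <- sim_geno(n, p_test)
X_other <- sim_geno(n, p_other)

# Causal SNPs are a random subset of the test SNPs.
n_causal   <- round(c_prop * p_test)
causal_idx <- sort(sample(p_test, n_causal))
X_causal   <- X_test[, causal_idx, drop = FALSE]
beta       <- runif(n_causal, min = 0.9, max = 1.1)

# Kinship from standardized genotypes (assumed estimator);
# monomorphic SNPs are dropped because they cannot be standardized.
kinship <- function(X) {
  X <- X[, apply(X, 2, var) > 0, drop = FALSE]
  Z <- scale(X)
  tcrossprod(Z) / ncol(Z)
}
Phi <- kinship(X_other)

# Assumed mean and the covariance from the sampling model.
mu <- drop(X_causal %*% beta)
V  <- eta * sigma2 * Phi + (1 - eta) * sigma2 * diag(n)

# Draw Y = mu + L z, where V = L L' and z ~ N(0, I).
L <- t(chol(V))
Y <- mu + drop(L %*% rnorm(n))
```

Because $(1-\eta)\sigma^2 \boldsymbol{I}$ is added to a positive semi-definite kinship term, the covariance is positive definite whenever $\eta < 1$, so the Cholesky factorization is well defined.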


We consider the following simulation scenarios. In each scenario, we set $c = 0.01$, $\eta \in \{0.1, 0.5\}$ and $\sigma^2 = 4$:

**Scenario 1**  
All the causal SNPs are included in the calculation of the kinship matrix.

$\mathbf{X}^{(kinship)} = \left[\mathbf{X}^{(other)} ; \mathbf{X}^{(causal)}\right]$

**Scenario 2**  
None of the causal SNPs are included in the calculation of the kinship matrix.

$\mathbf{X}^{(kinship)} = \left[\mathbf{X}^{(other)} \right]$
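The two scenarios differ only in which columns feed the kinship calculation, which can be sketched as follows. The toy genotype matrices and the standardized-genotype `kinship()` helper are assumptions for illustration; the report's own kinship estimator may differ.

```r
set.seed(2)

n <- 20
# Toy stand-ins for the matrices defined above (0/1/2 genotype coding).
X_other  <- matrix(rbinom(n * 30, size = 2, prob = 0.3), nrow = n)
X_causal <- matrix(rbinom(n * 3,  size = 2, prob = 0.3), nrow = n)

# Scenario 1: causal SNPs are included in the kinship calculation.
X_kinship_s1 <- cbind(X_other, X_causal)
# Scenario 2: only the other SNPs are used.
X_kinship_s2 <- X_other

# Hypothetical kinship estimator based on standardized genotypes.
kinship <- function(X) {
  X <- X[, apply(X, 2, var) > 0, drop = FALSE]  # drop monomorphic SNPs
  Z <- scale(X)
  tcrossprod(Z) / ncol(Z)
}
Phi_s1 <- kinship(X_kinship_s1)
Phi_s2 <- kinship(X_kinship_s2)
```

Both matrices are $n \times n$; only the set of SNPs contributing to them changes between scenarios.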


## Code used to produce results

```r
library(simulator)

<<models>>
<<methods>>
<<metrics>>

<<init>>
<<main>>

<<init>>
sim <- load_simulation(name_of_simulation)
```

# Results

<<plots>>

## Mean True Positive Rate (standard error) over 200 simulations

<<tpr>>

## Mean False Positive Rate (standard error) over 200 simulations

<<fpr>>
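The report does not spell out how the true and false positive rates are computed. A minimal sketch, assuming the usual variable-selection definitions (a SNP counts as selected when its estimated coefficient is nonzero), might look like this; `tpr_fpr()` is a hypothetical helper and the `tpr` and `fpr` chunks may be implemented differently.

```r
# Assumed definitions:
#   TPR = selected causal SNPs / all causal SNPs
#   FPR = selected non-causal SNPs / all non-causal SNPs
tpr_fpr <- function(beta_hat, causal_idx) {
  p        <- length(beta_hat)
  selected <- which(beta_hat != 0)
  tp <- length(intersect(selected, causal_idx))
  fp <- length(setdiff(selected, causal_idx))
  c(tpr = tp / length(causal_idx),
    fpr = fp / (p - length(causal_idx)))
}

# Toy example: 10 SNPs, of which SNPs 1, 3 and 4 are causal.
beta_hat   <- c(1.0, 0, 0.8, 0, 0.2, 0, 0, 0, 0, 0)
causal_idx <- c(1, 3, 4)
rates <- tpr_fpr(beta_hat, causal_idx)
```

Here two of the three causal SNPs are selected (TPR of 2/3) and one of the seven non-causal SNPs is selected (FPR of 1/7).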

## Mean Number of Active Variables (standard error) over 200 simulations

Active variables: The number of variables with a non-zero estimated coefficient for a given model

<<nactive>>
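Counting active variables and summarizing a metric as a mean with its standard error across replications can be sketched as below. The coefficient vectors are randomly generated placeholders, not output from any fitted model, and `n_active()` and `mean_se()` are hypothetical helper names.

```r
# Number of variables with a non-zero estimated coefficient.
n_active <- function(beta_hat) sum(beta_hat != 0)

# Mean and standard error of a metric across replications.
mean_se <- function(x) c(mean = mean(x), se = sd(x) / sqrt(length(x)))

set.seed(3)
# 200 placeholder replications: each of 50 coefficients is nonzero
# with probability 0.2 (purely illustrative).
nactive_draws <- replicate(200, {
  beta_hat <- rnorm(50) * rbinom(50, size = 1, prob = 0.2)
  n_active(beta_hat)
})
summary_nactive <- mean_se(nactive_draws)
```

The same `mean_se()` summary applies to any per-replication metric, such as the rates or the model error.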

## Mean Model Error (standard error) over 200 simulations

Model Error:

\[\Vert \mathbf{X}\boldsymbol{\beta} - \mathbf{X}\hat{\boldsymbol{\beta}} \Vert_2\]

<<model-error>>
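The model error is the Euclidean distance between the true and estimated linear predictors. A minimal sketch on a small hand-checkable example follows; `model_error()` is a hypothetical helper, and the `model-error` chunk itself may be written differently.

```r
# Euclidean norm of the difference between true and fitted linear predictors.
model_error <- function(X, beta, beta_hat) {
  sqrt(sum((X %*% beta - X %*% beta_hat)^2))
}

# Toy design with rows (1,0), (0,1), (1,1).
X        <- matrix(c(1, 0,
                     0, 1,
                     1, 1), nrow = 3, byrow = TRUE)
beta     <- c(1, 1)
beta_hat <- c(1, 0)

# X beta = (1, 1, 2) and X beta_hat = (1, 0, 1), so the difference is (0, 1, 1).
me <- model_error(X, beta, beta_hat)
```

In this example the error equals $\sqrt{2}$, and it is zero whenever the estimate reproduces the true linear predictor exactly.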

## Estimation of variance components for ggmix only

<<plots>>

# Components

## Models

<<models>>

## Methods

<<methods>>

## Metrics

<<metrics>>

# References

```r
citation("simulator")
citation("glmnet")
citation("gaston")
citation("magrittr")
citation("ggplot2")
citation("knitr")
```

sahirbhatnagar/penfam documentation built on April 14, 2021, 9:38 a.m.
