
Function that generates data for the different simulation studies
presented in the accompanying paper. This function requires the
`popkin` and `bnpsd` packages to be installed.

```
gen_structured_model(
  n,
  p_design,
  p_kinship,
  k,
  s,
  Fst,
  b0,
  nPC = 10,
  eta,
  sigma2,
  geography = c("ind", "1d", "circ"),
  percent_causal,
  percent_overlap,
  train_tune_test = c(0.6, 0.2, 0.2)
)
```

| Argument | Description |
|---|---|
| `n` | number of observations to simulate |
| `p_design` | number of variables in X_test, i.e., the design matrix |
| `p_kinship` | number of variables in X_kinship, i.e., the matrix used to calculate kinship |
| `k` | number of intermediate subpopulations |
| `s` | the desired bias coefficient, which specifies sigma indirectly. Required if sigma is missing |
| `Fst` | the desired final FST of the admixed individuals. Required if sigma is missing |
| `b0` | the true intercept parameter |
| `nPC` | number of principal components to include in the design matrix used for regression adjustment for population structure via principal components. This matrix is used as the input to a standard lasso regression routine, where there are no random effects |
| `eta` | the true eta parameter, which has to be |
| `sigma2` | the true sigma2 parameter |
| `geography` | the type of geography for simulating the kinship matrix. "ind" is independent populations where every individual is actually unadmixed, "1d" is a 1D geography and "circ" is a circular geography. Default: "ind". See the functions in the |
| `percent_causal` | percentage of |
| `percent_overlap` | the percentage of causal SNPs that will also be included in the calculation of the kinship matrix |
| `train_tune_test` | the proportions of the sample used for training, tuning parameter selection, and testing. Default is a 60/20/20 split |
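
As a rough illustration of what a `train_tune_test` split means, the sketch below partitions `n` row indices in base R. The helper `split_indices`, its rounding rule, and the random assignment of rows are assumptions made for this example; the procedure used inside `gen_structured_model` may differ.

```r
# Hypothetical helper (not part of the package): split 1..n into three index sets
split_indices <- function(n, train_tune_test = c(0.6, 0.2, 0.2)) {
  stopifnot(length(train_tune_test) == 3,
            isTRUE(all.equal(sum(train_tune_test), 1)))
  n_train <- round(n * train_tune_test[1])  # assumed rounding rule
  n_tune  <- round(n * train_tune_test[2])
  shuffled <- sample.int(n)                 # random permutation of the rows
  list(
    train = shuffled[seq_len(n_train)],
    tune  = shuffled[n_train + seq_len(n_tune)],
    test  = shuffled[-seq_len(n_train + n_tune)]  # whatever remains
  )
}

set.seed(1)
idx <- split_indices(100, c(0.8, 0.1, 0.1))
lengths(idx)  # sizes of the training, tuning and test sets
```

The three index sets are disjoint and together cover every row, so each simulated individual lands in exactly one of the sets.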

The kinship is estimated using the `popkin` function from the
`popkin` package. This function will multiply that kinship matrix by 2
to give the expected covariance matrix, which is subsequently used in the
linear mixed models.
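
To make the factor of 2 concrete, here is a small base R sketch with a toy kinship matrix. The values in `kin` are invented for illustration and are not output of `popkin`.

```r
# Toy kinship matrix for three individuals (made-up values).
# For a non-inbred individual the self-kinship is 0.5.
kin <- matrix(c(0.50, 0.25, 0.00,
                0.25, 0.50, 0.00,
                0.00, 0.00, 0.50),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("id", 1:3), paste0("id", 1:3)))

# Expected covariance of the random effect: twice the kinship,
# so non-inbred individuals get a unit diagonal.
covmat <- 2 * kin
covmat
```

Doubling the kinship puts the matrix on the scale of a correlation-like covariance, which is the form typically supplied to linear mixed model routines.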

A list with the following elements

- ytrain
simulated response vector for training set

- ytune
simulated response vector for tuning parameter selection set

- ytest
simulated response vector for test set

- xtrain
simulated design matrix for training set

- xtune
simulated design matrix for tuning parameter selection set

- xtest
simulated design matrix for testing set

- xtrain_lasso
simulated design matrix for training set for lasso model. This is the same as xtrain, but also includes the nPC principal components

- xtune_lasso
simulated design matrix for tuning parameter selection set for lasso model. This is the same as xtune, but also includes the nPC principal components

- xtest_lasso
simulated design matrix for testing set for lasso model. This is the same as xtest, but also includes the nPC principal components

- causal
character vector of the names of the causal SNPs

- beta
the vector of true regression coefficients

- kin_train
2 times the estimated kinship for the training set individuals

- kin_tune_train
The covariance matrix between the tuning set and the training set individuals

- kin_test_train
The covariance matrix between the test set and training set individuals

- Xkinship
the matrix of SNPs used to estimate the kinship matrix

- not_causal
character vector of the non-causal SNPs

- PC
the principal components for population structure adjustment
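
The relationship between the plain design matrices and their `_lasso` counterparts can be sketched in base R as follows. This is only an illustration on random stand-in data: the objects `x`, `xk`, `pc`, and `x_lasso` are hypothetical, and computing the components with `prcomp` on the kinship SNPs is an assumption, not necessarily how the function derives `PC`.

```r
set.seed(42)
n <- 20; p <- 5; nPC <- 3

# Stand-ins for the simulated objects (random genotypes, for illustration only)
x  <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n,
             dimnames = list(NULL, paste0("snp", seq_len(p))))
xk <- matrix(rbinom(n * 50, size = 2, prob = 0.3), nrow = n)  # kinship SNPs

# Assumed approach: take the leading principal components of the kinship SNPs
pc <- prcomp(xk)$x[, seq_len(nPC), drop = FALSE]

# Lasso design matrix: the original predictors plus the nPC components
x_lasso <- cbind(x, pc)
dim(x_lasso)
```

The lasso design matrix has `nPC` more columns than the plain one, which is how a model without random effects can still adjust for population structure.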

```
admixed <- gen_structured_model(n = 100,
                                p_design = 50,
                                p_kinship = 5e2,
                                geography = "1d",
                                percent_causal = 0.10,
                                percent_overlap = "100",
                                k = 5, s = 0.5, Fst = 0.1,
                                b0 = 0, nPC = 10,
                                eta = 0.1, sigma2 = 1,
                                train_tune_test = c(0.8, 0.1, 0.1))
names(admixed)
```
