data_mixed: Simulated mixed bivariate phenotype


Description

To simulate genotypes, we used the 1000 Genomes Project database. We selected variants within 500 kb of the BRCA1 gene, in which several known mutations are associated with a higher risk of developing breast, ovarian and prostate cancers. To limit multicollinearity, we pruned variants in high linkage disequilibrium (r^2 > 0.7). We further restricted the sample to 503 subjects of European genetic ancestry in order to avoid population structure. One discrete and two continuous traits were simulated, using a Gaussian copula to model their joint dependence. Finally, we also simulated one discrete and one continuous covariate.
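As a rough illustration of the Gaussian copula idea mentioned above, the following base R sketch draws a correlated pair of latent normal variables, maps them to uniform margins, and then applies a threshold (for a binary trait) and a Gamma quantile function (for a continuous trait). It is not the code used to build this data set: the correlation `rho`, the prevalence of 0.3, and the Gamma shape and rate are hypothetical values chosen only for the example, and covariate and SNP effects are left out.

```r
set.seed(1)
n   <- 503   # number of subjects, matching the data set
rho <- 0.5   # hypothetical latent correlation (assumption, not from the source)

# Correlated standard normal pair: multiply independent normals by the
# upper-triangular Cholesky factor of the 2 x 2 correlation matrix.
Sigma <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(Sigma)

# Probability integral transform: each column becomes uniform on (0, 1),
# while the dependence between columns is preserved.
u <- pnorm(z)

# Binary trait: threshold the first margin (hypothetical prevalence of 0.3).
y_bin <- as.integer(u[, 1] > 1 - 0.3)

# Continuous trait: Gamma margin through its quantile function
# (hypothetical shape and rate).
y_gamma <- qgamma(u[, 2], shape = 2, rate = 1)
```

The two simulated traits keep the dependence induced by `rho` even though their marginal distributions differ, which is the property a copula is used for here.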

Usage


Format

This data frame has 503 rows and the following 35 columns:

y.bin

discrete trait simulated from a latent Gaussian variable

y.gauss

continuous trait simulated from a Gaussian distribution

y.Gamma

continuous trait simulated from a Gamma distribution

x1

intercept

x2

discrete covariate

x3

continuous covariate

V1:V30

30 consecutive SNPs sampled from a random genomic region within 500 kb of the BRCA1 gene on chromosome 17

Source

https://www.internationalgenome.org/data/


julstpierre/CBMAT documentation built on Aug. 7, 2021, 9:31 p.m.