gwas_sim: Simulate GWAS data with known functional SNPs


Description

gwas_sim takes a real set of genotypes and simulates a phenotype driven by a known set of functional SNPs. The user chooses the number of SNPs that affect the phenotype and the heritability. The function returns the phenotype, the list of functional SNPs, a dataset that contains the genotypes with the functional SNPs removed, and some other quantities that may be of interest. Note that the genotypes are not changed; only the phenotype is simulated, so that it depends on the selected SNPs. This approach preserves all of the physical characteristics of the organism's genetics.

Usage

gwas_sim(genotype, num_snps = 10, heretability = 0.3)

Arguments

genotype

A matrix containing the genotypes for an organism. The rows represent the subjects and the columns are the SNPs.

num_snps

The number of SNPs that will influence the phenotype.

heretability

A number between 0 and 1 giving the proportion of variance in the phenotype that is explained by the genotype (R-squared).

Details

Phenotypes are computed by randomly selecting a set of SNPs to be functional and drawing their effects as random normal variables with mean 0 and variance 1. The effects for the rest of the SNPs are 0. The phenotype is y = effect * SNP + noise, where effect * SNP is summed over the functional SNPs. The noise is a random normal variable with mean = 0 and variance = (1 - heretability) * var(effect * SNP) / heretability, so that the genetic term accounts for a fraction heretability of the total phenotypic variance.
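The steps described in Details can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package source: the function name sim_sketch, the toy genotype matrix, and the way estimated_heretability is computed here (genetic variance divided by phenotypic variance) are all hypothetical.

```r
# Hypothetical re-implementation of the documented steps; not the gwas3 source.
sim_sketch <- function(genotype, num_snps = 10, heretability = 0.3) {
  # Randomly choose which columns (SNPs) are functional.
  functional_snps <- sort(sample(ncol(genotype), num_snps))

  # One standard-normal effect per functional SNP; all other effects are 0.
  effect <- rnorm(num_snps, mean = 0, sd = 1)

  # Genetic component: effect * SNP summed over the functional SNPs.
  genetic <- as.vector(genotype[, functional_snps, drop = FALSE] %*% effect)

  # Noise variance from the formula in Details.
  noise_var <- (1 - heretability) * var(genetic) / heretability
  noise <- rnorm(nrow(genotype), mean = 0, sd = sqrt(noise_var))

  phenotype <- genetic + noise
  list(
    phenotype = phenotype,
    functional_snps = functional_snps,
    effect = effect,
    # Assumption: one plausible estimate; the package may compute it differently.
    estimated_heretability = var(genetic) / var(phenotype)
  )
}

# Toy genotype matrix: 500 subjects by 200 SNPs coded 0/1/2 (made-up data).
set.seed(1)
geno <- matrix(rbinom(500 * 200, size = 2, prob = 0.3), nrow = 500)
sim <- sim_sketch(geno, num_snps = 10, heretability = 0.3)
```

Because the noise is drawn at random, the estimated value in any single run will be near, but not exactly equal to, the requested heretability.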

Value

Returns a phenotype that is directly affected by the selected SNPs, along with other quantities that are useful for working with and interpreting the simulated data.

phenotype

A numeric vector with the phenotype for the simulated data. This is continuous, but a binary phenotype can be created by the user from the returned phenotypes.
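One simple way to derive a binary phenotype is to threshold the continuous values. The sketch below splits at the median to give balanced cases and controls; the stand-in phenotype vector and the choice of the median as cutoff are assumptions for illustration, and other thresholds (for example, a quantile matching a disease prevalence) may suit a given study better.

```r
# Stand-in for the continuous phenotype returned by gwas_sim (made-up values).
set.seed(2)
phenotype <- rnorm(100)

# Code subjects above the median as cases (1) and the rest as controls (0).
case_status <- as.integer(phenotype > median(phenotype))
table(case_status)
```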

functional_snps

Numeric vector containing the column numbers of the SNPs that affect the phenotype.

geno_snp_removed

A dataset of genotypes with the functional SNPs removed.
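Removing the functional SNPs amounts to dropping their columns from the genotype matrix. A minimal sketch with made-up data follows; the variable names and values are hypothetical, and the package may build this dataset differently.

```r
# Toy genotype matrix: 5 subjects by 6 SNPs coded 0/1/2 (made-up data).
set.seed(3)
genotype <- matrix(rbinom(5 * 6, size = 2, prob = 0.3), nrow = 5)

# Hypothetical column numbers of the functional SNPs.
functional_snps <- c(2, 5)

# Negative indexing drops those columns; drop = FALSE keeps the result a matrix.
geno_snp_removed <- genotype[, -functional_snps, drop = FALSE]
dim(geno_snp_removed)
```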

estimated_heretability

A measure of the heritability in the simulated dataset. This should be close to the value supplied for the heretability argument.

effect

The effect sizes used as coefficients for the functional SNPs.


jillbo1000/gwas3 documentation built on June 14, 2019, 3:08 a.m.