View source: R/createphenotypeFunctions.R
runSimulation wraps around setModel and the phenotype component functions (geneticFixedEffects, geneticBgEffects, noiseBgEffects, noiseFixedEffects and correlatedBgEffects), rescales each component and combines them into the final phenotype. For details on all parameters, see the respective functions.
runSimulation(
N,
P,
genVar = NULL,
h2s = NULL,
theta = 0.8,
h2bg = NULL,
eta = 0.8,
noiseVar = NULL,
rho = NULL,
delta = NULL,
gamma = 0.8,
phi = NULL,
alpha = 0.8,
tNrSNP = 5000,
cNrSNP = 20,
SNPfrequencies = c(0.1, 0.2, 0.4),
genotypefile = NULL,
format = "delim",
genoFilePrefix = NULL,
genoFileSuffix = NULL,
genoDelimiter = ",",
skipFields = NULL,
header = FALSE,
probabilities = FALSE,
chr = NULL,
NrSNPsOnChromosome = NULL,
NrChrCausal = NULL,
kinshipfile = NULL,
kinshipHeader = FALSE,
kinshipDelimiter = ",",
standardise = TRUE,
distBetaGenetic = "norm",
mBetaGenetic = 0,
sdBetaGenetic = 1,
pTraitsAffectedGenetics = 1,
pIndependentGenetic = 0.4,
pTraitIndependentGenetic = 0.2,
keepSameIndependentSNPs = FALSE,
NrFixedEffects = 1,
NrConfounders = 10,
distConfounders = "norm",
mConfounders = 0,
sdConfounders = 1,
catConfounders = NULL,
probConfounders = NULL,
distBetaConfounders = "norm",
mBetaConfounders = 0,
sdBetaConfounders = 1,
pTraitsAffectedConfounders = 1,
pIndependentConfounders = 0.4,
pTraitIndependentConfounders = 0.2,
keepSameIndependentConfounders = FALSE,
pcorr = 0.8,
corrmatfile = NULL,
meanNoiseBg = 0,
sdNoiseBg = 1,
nonlinear = NULL,
logbase = 10,
expbase = NULL,
power = NULL,
customTransform = NULL,
transformNeg = "abs",
proportionNonlinear = 0,
sampleID = "ID_",
phenoID = "Trait_",
snpID = "SNP_",
seed = 219453,
verbose = FALSE
)
N |
Number [integer] of samples to simulate. |
P |
Number [integer] of phenotypes to simulate. |
genVar |
Proportion [double] of total genetic variance. |
h2s |
Proportion [double] of genetic variance of genetic variant effects. |
theta |
Proportion [double] of variance of shared genetic variant effects. |
h2bg |
Proportion [double] of genetic variance of infinitesimal genetic effects; either h2s or h2bg has to be specified, and h2s + h2bg = 1. |
eta |
Proportion [double] of variance of shared infinitesimal genetic effects. |
noiseVar |
Proportion [double] of total noise variance. |
rho |
Proportion [double] of noise variance of correlated effects; the sum of rho, delta and phi has to equal 1. |
delta |
Proportion [double] of noise variance of non-genetic covariate effects; the sum of rho, delta and phi has to equal 1. |
gamma |
Proportion [double] of variance of shared non-genetic covariate effects. |
phi |
Proportion [double] of noise variance of observational noise effects; the sum of rho, delta and phi has to equal 1. |
alpha |
Variance [double] of shared observational noise effect. |
tNrSNP |
Total number [integer] of SNPs to simulate; these SNPs are used for kinship estimation. |
cNrSNP |
Number [integer] of causal SNPs; used as genetic variant effects. |
SNPfrequencies |
Vector of allele frequencies [double] from which to sample. |
genotypefile |
Needed when reading external genotypes (into memory), path/to/genotype file [string] in format specified by format. |
format |
Needed when reading external genotypes; specifies the format of the genotype data. Has to be one of plink, oxgen, genome, bimbam or delim when reading files into memory, or one of oxgen, bimbam or delim when sampling genetic variants from file; for details see readStandardGenotypes and getCausalSNPs. |
genoFilePrefix |
Needed when sampling causal SNPs from file, full path/to/chromosome-wise-genotype-file-ending-before-"chrChromosomeNumber" (no '~' expansion!) [string] |
genoFileSuffix |
Needed when sampling causal SNPs from file, following chromosome number including fileformat (e.g. ".csv") [string] |
genoDelimiter |
Field separator [string] of genotypefile or genoFile if format == delim. |
skipFields |
Number [integer] of fields (columns) to skip in genoFilePrefix-genoFileSuffix-file. See details in getCausalSNPs if format == delim. |
header |
[logical] Can be set to indicate if genoFilePrefix-genoFileSuffix file has a header for format == 'delim'. See details in getCausalSNPs. |
probabilities |
[bool]. If set to TRUE, the genotypes in the files described by genoFilePrefix and genoFileSuffix are provided as triplets of probabilities (p(AA), p(Aa), p(aa)) and are converted into their expected genotype frequencies by 0*p(AA) + p(Aa) + 2p(aa) via probGen2expGen. |
chr |
Numeric vector of chromosomes [integer] to choose NrCausalSNPs from; only used when external genotype data is sampled, i.e. !is.null(genoFilePrefix). |
NrSNPsOnChromosome |
Specifies the number of SNPs [integer] per entry in chr (see above); has to be the same length as chr. If not provided, lines in genoFilePrefix-genoFileSuffix file will be counted (which can be slow for large files). |
NrChrCausal |
Number [integer] of causal chromosomes to choose NrCausalSNPs from (as opposed to the actual chromosomes to choose from via chr); only used when external genotype data is sampled, i.e. !is.null(genoFilePrefix). |
kinshipfile |
path/to/kinshipfile [string]; if provided, kinship for simulation of genetic background effect will be read from file. |
kinshipHeader |
[boolean] If TRUE kinship file has header information. |
kinshipDelimiter |
Field separator [string] of kinship file. |
standardise |
[boolean] If TRUE genotypes will be standardised for kinship estimation (recommended). |
distBetaGenetic |
Name [string] of distribution to use to simulate effect sizes of genetic variants; one of "unif" or "norm". |
mBetaGenetic |
Mean/midpoint [double] of normal/uniform distribution for effect sizes of genetic variants. |
sdBetaGenetic |
Standard deviation/extension from midpoint [double] of normal/uniform distribution for effect sizes of genetic variants. |
pTraitsAffectedGenetics |
Proportion [double] of traits affected by the genetic variant effect. For non-integer results of pTraitsAffected*P, the ceiling of the result is used. This allows simulating, for instance, different levels of pleiotropy. |
pIndependentGenetic |
Proportion [double] of genetic variant effects to have a trait-independent fixed effect. |
pTraitIndependentGenetic |
Proportion [double] of traits influenced by independent genetic variant effects. |
keepSameIndependentSNPs |
[boolean] If set to TRUE, the independent SNPs effects always influence the same subset of traits. |
NrFixedEffects |
Number [integer] of different non-genetic covariate effects to simulate; allows to simulate non-genetic covariate effects from different distributions or with different parameters. |
NrConfounders |
Number [integer] of non-genetic covariates; used as non-genetic covariate effects. |
distConfounders |
Vector of name(s) [string] of distributions to use to simulate confounders; one of "unif", "norm", "bin", "cat_norm", "cat_unif". |
mConfounders |
Vector of mean(s)/midpoint(s) [double] of normal/uniform distribution for confounders. |
sdConfounders |
Vector of standard deviation(s)/extension from midpoint(s) [double] of normal/uniform distribution for confounders. |
catConfounders |
Vector of confounder categories [factor]; required if distConfounders is "cat_norm" or "cat_unif". |
probConfounders |
Vector of probability(ies) [double] of binomial confounders (0/1); required if distConfounders is "bin". |
distBetaConfounders |
Vector of name(s) [string] of distribution to use to simulate effect sizes of confounders; one of "unif" or "norm". |
mBetaConfounders |
Vector of mean(s)/midpoint(s) [double] of normal/uniform distribution for effect sizes of confounders. |
sdBetaConfounders |
Vector of standard deviation(s)/extension from midpoint(s) [double] of normal/uniform distribution for effect sizes of confounders. |
pTraitsAffectedConfounders |
Proportion(s) [double] of traits affected by the non-genetic covariates. For non-integer results of pTraitsAffected*P, the ceiling of the result is used. |
pIndependentConfounders |
Vector of proportion(s) [double] of non-genetic covariate effects to have a trait-independent effect. |
pTraitIndependentConfounders |
Vector of proportion(s) [double] of traits influenced by independent non-genetic covariate effects. |
keepSameIndependentConfounders |
[boolean] If set to TRUE, the independent confounder effects always influence the same subset of traits. |
pcorr |
Correlation [double] between phenotypes. |
corrmatfile |
path/to/corrmatfile.csv [string] with comma-separated [P x P] numeric [double] correlation matrix; if provided, correlation matrix for simulation of correlated background effect will be read from file; file should NOT contain an index or header column. |
meanNoiseBg |
Mean [double] of the normal distributions for the simulation of the observational noise effects. |
sdNoiseBg |
Standard deviation [double] of the normal distributions for the simulation of the observational noise effects. |
nonlinear |
Non-linear transformation method [string]; one of exp (exponential), log (logarithm), poly (polynomial), sqrt (square root) or custom (user-supplied function). If log or exp, the base can be specified (logbase, expbase); if poly, the power can be specified (power); if custom, a custom function has to be supplied (see customTransform). Non-linear transformation is optional; the default is NULL, i.e. no transformation (see details). |
logbase |
[int] base of logarithm for non-linear phenotype transformation (see details). |
expbase |
[int] base of exponential function for non-linear phenotype transformation (see details). |
power |
[double] power of polynomial function for non-linear phenotype transformation. |
customTransform |
[function] custom transformation function accepting a single argument. |
transformNeg |
[string] transformation method for negative values in non-linear phenotype transformation. One of abs (absolute value) or set0 (set all negative values to zero). If nonlinear==log and transformNeg==set0, negative values are set to 1e-5. |
proportionNonlinear |
[double] proportion of the phenotype to be non-linear (see details). |
sampleID |
Prefix [string] for naming samples (will be followed by sample number from 1 to N when constructing sample IDs); only used if genotypes/kinship are simulated/do not have sample IDs. |
phenoID |
Prefix [string] for naming traits (will be followed by phenotype number from 1 to P when constructing phenotype IDs). |
snpID |
Prefix [string] for naming SNPs (will be followed by SNP number from 1 to NrSNP when constructing SNP IDs). |
seed |
Seed [integer] to initiate random number generation. |
verbose |
[boolean] If TRUE, progress info is printed to standard out. |
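The conversion described for the probabilities argument, 0*p(AA) + p(Aa) + 2p(aa), can be sketched in base R as follows. The helper name expectedGenotype and the flat input layout (one triplet after another) are assumptions made for illustration; this is not the package's probGen2expGen implementation.

```r
# Hypothetical helper illustrating the conversion described for `probabilities`.
# NOT the package's probGen2expGen; the name and input layout are assumptions.
expectedGenotype <- function(probs) {
  # probs: numeric vector holding consecutive triplets (p(AA), p(Aa), p(aa)),
  # one triplet per sample
  stopifnot(length(probs) %% 3 == 0)
  triplets <- matrix(probs, ncol = 3, byrow = TRUE)
  # weight each genotype probability: 0 * p(AA) + 1 * p(Aa) + 2 * p(aa)
  as.vector(triplets %*% c(0, 1, 2))
}

probs <- c(1,   0,   0,     # certain AA
           0,   1,   0,     # certain Aa
           0.1, 0.3, 0.6)   # uncertain call
expected <- expectedGenotype(probs)
print(expected)
```

Each sample's triplet collapses to a single value between 0 and 2.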
Phenotypes are modeled under a linear additive model where Y = WA + BX + G + C + Phi, with WA the non-genetic covariate effects, BX the genetic variant effects, G the infinitesimal genetic effects, C the correlated background effects and Phi the observational noise. For more information on these components, see the respective function descriptions. Optionally, the phenotypes can be non-linearly transformed via Y_trans = (1-alpha) x Y + alpha x f(Y). Here alpha is the proportion of non-linearity of the phenotype (set via proportionNonlinear, not the argument alpha) and f is a non-linear transformation, one of exp, log, poly, sqrt or a custom function (see nonlinear).
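A minimal base R sketch of the transformation Y_trans = (1-alpha) x Y + alpha x f(Y), using sqrt as f. The function name mixNonlinear is hypothetical, and applying the negative-value handling only to the input of f (leaving the linear part untouched) is an assumption; the package may handle negative values differently.

```r
# Illustrative sketch only; mixNonlinear is a hypothetical name and the
# treatment of negative values is an assumption, not the package's code.
mixNonlinear <- function(Y, alpha, f = sqrt, transformNeg = c("abs", "set0")) {
  transformNeg <- match.arg(transformNeg)
  stopifnot(alpha >= 0, alpha <= 1)
  # sqrt is undefined for negative values, so handle them before applying f
  Ypos <- if (transformNeg == "abs") abs(Y) else pmax(Y, 0)
  # blend the linear phenotype with its non-linear transformation
  (1 - alpha) * Y + alpha * f(Ypos)
}

Y <- matrix(c(-4, 0, 9, 16), nrow = 2)
Ytrans <- mixNonlinear(Y, alpha = 0.5, f = sqrt, transformNeg = "abs")
print(Ytrans)
```

With alpha = 0 the phenotype is returned unchanged, which corresponds to the default proportionNonlinear = 0.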
Named list of i) dataframe of proportion of variance explained for each component (varComponents), ii) a named list with the final simulated phenotype components (phenoComponentsFinal), iii) a named list with the intermediate simulated phenotype components (phenoComponentsIntermediate), iv) a named list of parameters describing the model setup (setup) and v) a named list of raw components (rawComponents) used for genetic effect simulation (genotypes and/or kinship, eigenvalues and eigenvectors of kinship)
setModel, geneticFixedEffects, geneticBgEffects, noiseBgEffects, noiseFixedEffects, correlatedBgEffects and rescaleVariance.
# simulate phenotypes of 100 samples and 5 traits from genetic variant effects
# and observational noise, with variance explained of 0.2 and 0.8 respectively
genVar <- 0.2
simulatedPhenotype <- runSimulation(N=100, P=5, cNrSNP=10,
                                    genVar=genVar, h2s=1, phi=1)
Set seed: 219453
The total noise variance (noiseVar) is: 0.8
The noise model is: noiseBgOnly
Proportion of random noise variance (phi): 1
Variance of shared random noise effect (alpha): 0.8
The total genetic variance (genVar) is: 0.2
The genetic model is: geneticFixedOnly
Proportion of variance of fixed genetic effects (h2s): 1
Proportion of variance of shared fixed genetic effects (theta): 0.8
Proportion of fixed genetic effects to have a trait-independent fixed effect (pIndependentGenetic): 0.4
Proportion of traits influenced by independent fixed genetic effects (pTraitIndependentGenetic): 0.2
Simulate noise terms (noise model: noiseBgOnly )
Simulate noise background effects
Simulate genetic effects (genetic model: geneticFixedOnly )
Simulate 5000 SNPs...
Simulate genetic fixed effects
Construct final simulated phenotype
Put all phenotype components together...
Warning message:
In runSimulation(N = 100, P = 5, cNrSNP = 10, genVar = genVar, h2s = 1, :
The genetic model does not contain random effects but the total number of SNPs to simulate (tNrSNP: 5000 ) is larger than the causal number of SNPs (cNrSNP: 10 ). If genotypes are not needed, consider setting tNrSNP=cNrSNP to speed up computation
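The proportions reported in the log above can be reproduced by hand. The sketch below assumes that genVar and noiseVar sum to 1 (as the log suggests) and that theta and alpha split each component into a shared and an independent part; the component names are made up for illustration and are not the names used in varComponents.

```r
# Arithmetic sketch of the variance partition in the example above.
# Component names are illustrative assumptions, not package output.
genVar   <- 0.2
noiseVar <- 1 - genVar        # total noise variance: 0.8

h2s <- 1; h2bg <- 1 - h2s     # genetic variance: variant effects only
phi <- 1; rho <- 0; delta <- 0  # noise variance: observational noise only
theta <- 0.8                  # shared proportion of genetic variant effects
alpha <- 0.8                  # shared proportion of observational noise

# the constraints stated in the argument descriptions
stopifnot(isTRUE(all.equal(h2s + h2bg, 1)),
          isTRUE(all.equal(rho + delta + phi, 1)))

components <- c(
  geneticFixedShared      = genVar * h2s * theta,
  geneticFixedIndependent = genVar * h2s * (1 - theta),
  noiseBgShared           = noiseVar * phi * alpha,
  noiseBgIndependent      = noiseVar * phi * (1 - alpha)
)
print(components)
print(sum(components))
```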