| simfam2 | R Documentation |
Generate familial time-to-event data from a correlated frailty model with kinship and/or IBD matrices, given pedigree data.
simfam2(inputdata = NULL, IBD = NULL, design = "pop", variation = "none", depend = NULL,
base.dist = "Weibull", base.parms = c(0.016, 3), var_names = c("gender", "mgene"),
vbeta = c(1, 1), agemin = 20, hr = NULL)
inputdata |
Dataframe contains variables |
IBD |
IBD matrix |
design |
Family based study design used in the simulations. Possible choices are: |
variation |
Source of residual familial correlation. Possible choices are: |
depend |
Inverse of variance for the frailty distribution. A single value should be specified when |
base.dist |
Choice of baseline hazard distribution. Possible choices are: |
base.parms |
Vector of parameter values for the specified baseline hazard function. |
var_names |
Names of variables to be used in generating time-to-event data. Specified variables should be part of |
vbeta |
Vector of regression coefficients for the variables specified by |
hr |
Proportion of high risk families, which include at least two affected members, to be sampled from the two stage sampling. This value should be specified when |
agemin |
Minimum age of disease onset. Default is 20 years of age. |
The ages at onset are generated from the correlated frailties and covariates using the following model:
The correlated shared frailty model with kinship and/or IBD matrices
h(t|X,Z) = h0(t - t0) Z exp( X*vbeta ),
where h0(t) is the baseline hazard function; t0 is the minimum age of disease onset; Z is a vector of frailties following a multivariate log-normal distribution, with mean 0 and variance 2*K*sig1 + D*sig2 on the log scale, where K is the kinship matrix and D is the IBD matrix; sig1 and sig2 are the variance components for each matrix, with values specified by depend = c(1/sig1, 1/sig2); X is a vector of variables whose names are specified by var_names; and vbeta is the vector of corresponding coefficients.
The variance structure of the frailties shared within families is selected with variation = "kinship", variation = "IBD", or both via variation = c("kinship", "IBD").
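As a rough illustration of this variance structure, the following base-R sketch builds the covariance 2*K*sig1 + D*sig2 for a toy nuclear family and draws one frailty vector. The kinship values, the identity IBD matrix, and the assumption that frailties are obtained by exponentiating a mean-zero multivariate normal draw are illustrative choices, not taken from the package internals.

```r
set.seed(1)

# Toy nuclear family: father, mother, two children (hypothetical values).
# Kinship: 0.5 with oneself, 0.25 parent-child and sib-sib, 0 between parents.
K <- matrix(c(0.50, 0.00, 0.25, 0.25,
              0.00, 0.50, 0.25, 0.25,
              0.25, 0.25, 0.50, 0.25,
              0.25, 0.25, 0.25, 0.50), nrow = 4, byrow = TRUE)
D <- diag(1, 4)          # placeholder IBD matrix (identity), an assumption

depend <- c(1, 1)        # depend = c(1/sig1, 1/sig2)
sig1 <- 1 / depend[1]
sig2 <- 1 / depend[2]

# Covariance of the log-frailties
Sigma <- 2 * K * sig1 + D * sig2

# Draw log-frailties from N(0, Sigma) via the Cholesky factor, then exponentiate
U <- chol(Sigma)                        # upper triangular: t(U) %*% U equals Sigma
logZ <- as.vector(t(U) %*% rnorm(4))
Z <- exp(logZ)                          # one positive frailty per family member
```

Larger values of depend correspond to smaller variance components and therefore to frailties closer to 1, that is, weaker residual familial correlation.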
When variation = "none", the ages at onset are generated independently from the proportional hazards model conditional on the covariates X.
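The independent case can be sketched with inverse-transform sampling. This example assumes that base.parms holds (lambda, rho) and that the Weibull cumulative baseline hazard has the form (lambda*t)^rho; both are assumptions about the parameterization, and the covariate values are made up for illustration.

```r
set.seed(2)

n <- 5
# Hypothetical binary covariates named as in var_names
X <- cbind(gender = rbinom(n, 1, 0.5), mgene = rbinom(n, 1, 0.5))
vbeta <- c(1, 1)
base.parms <- c(0.016, 3)   # assumed to be (lambda, rho)
agemin <- 20

lambda <- base.parms[1]
rho <- base.parms[2]

# Assumed survival function: S(t | X) = exp(-(lambda * (t - agemin))^rho * exp(X %*% vbeta)).
# Setting S(t | X) = u with u ~ Uniform(0, 1) and solving for t gives:
u <- runif(n)
lp <- as.vector(X %*% vbeta)
ageonset <- agemin + (-log(u) / exp(lp))^(1 / rho) / lambda
```

Every simulated age exceeds agemin because the baseline hazard is shifted by t0 in the model.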
The design argument defines the type of family-based design to be simulated. Two variants each of the population-based and clinic-based designs can be chosen: "pop" when the proband is affected; "pop+" when the proband is an affected mutation carrier; "cli" when the proband is affected and at least one parent and one sibling are affected; and "cli+" when the proband is an affected mutation carrier and at least one parent and one sibling are affected. The two-stage design, "twostage", oversamples high-risk families, with the proportion of high-risk families to include in the sample specified by hr. High-risk families are those that include at least two affected members. Use design = "noasc" for no ascertainment correction.
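The ascertainment criteria above can be expressed as a small check on a single family. The helper meets_design() and the columns role, status, and mgene are hypothetical names used only for this sketch; it is not how the package implements ascertainment.

```r
# Hypothetical helper: does a toy family satisfy the criteria of a given design?
meets_design <- function(fam, design) {
  pb <- fam[fam$role == "proband", ]
  pb_affected <- pb$status == 1
  pb_carrier  <- pb$mgene == 1
  parent_aff  <- any(fam$status[fam$role == "parent"] == 1)
  sib_aff     <- any(fam$status[fam$role == "sibling"] == 1)
  switch(design,
         "pop"  = pb_affected,
         "pop+" = pb_affected && pb_carrier,
         "cli"  = pb_affected && parent_aff && sib_aff,
         "cli+" = pb_affected && pb_carrier && parent_aff && sib_aff,
         stop("design not covered by this sketch"))
}

# Toy family: one affected parent, an affected carrier proband, an unaffected sibling
fam <- data.frame(role   = c("parent", "parent", "proband", "sibling"),
                  status = c(1, 0, 1, 0),
                  mgene  = c(1, 0, 1, 0))

meets_design(fam, "pop")   # TRUE: the proband is affected
meets_design(fam, "cli")   # FALSE: no affected sibling
```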
Returns an object of class 'simfam', a data frame which contains inputdata and the following:
ageonset |
Ages at disease onset in years. |
time |
Ages at disease onset for the affected or ages of last follow-up for the unaffected. |
status |
Disease statuses: 1 for affected, 0 for unaffected (censored). |
fsize |
Family size including parents, siblings and children of the proband and the siblings. |
naff |
Number of affected members in family. |
weight |
Sampling weights. |
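To make the family-level columns concrete, this sketch derives fsize and naff from per-member data in base R. The toy data frame and the famID column name are assumptions made for illustration, and the highrisk flag simply applies the "at least two affected members" rule described for the two-stage design.

```r
# Hypothetical toy data: two families with one status indicator per member
toy <- data.frame(famID  = c(1, 1, 1, 2, 2, 2, 2),
                  status = c(1, 0, 1, 0, 1, 0, 0))

# Family size and number of affected members, repeated on every row of the family
toy$fsize <- ave(toy$status, toy$famID, FUN = length)
toy$naff  <- ave(toy$status, toy$famID, FUN = sum)

# Flag families with at least two affected members
toy$highrisk <- toy$naff >= 2
```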
Choi, Y.-H., Briollais, L., He, W. and Kopciuk, K. (2021) FamEvent: An R Package for Generating and Modeling Time-to-Event Data in Family Designs, Journal of Statistical Software 97 (7), 1-30. doi:10.18637/jss.v097.i07
Choi, Y.-H., Kopciuk, K. and Briollais, L. (2008) Estimating Disease Risk Associated with Mutated Genes in Family-Based Designs, Human Heredity 66, 238-251.
Choi, Y.-H. and Briollais, L. (2011) An EM Composite Likelihood Approach for Multistage Sampling of Family Data with Missing Genetic Covariates, Statistica Sinica 21, 231-253.
summary.simfam2, plot.simfam, penplot
## Example: simulate family data from a population-based design using
# a Weibull distribution for the baseline hazard and inducing
# residual familial correlation through kinship and IBD matrices.
# inputdata and an IBD matrix must be provided;
# here, inputdata is simulated with simfam() as an example.
library(FamEvent)
data <- simfam(N.fam = 10, design = "noasc", variation = "none",
               base.dist = "Weibull", base.parms = c(0.016, 3), vbeta = c(1, 1))
# Identity matrix used as a placeholder IBD matrix (one row per individual)
IBDmatrix <- diag(1, nrow(data))
data <- data[ , c(1:7, 11, 14)]
fam2 <- simfam2(inputdata = data, IBD = IBDmatrix, design = "pop",
                variation = c("kinship", "IBD"), depend = c(1, 1),
                base.dist = "Weibull", base.parms = c(0.016, 3),
                var_names = c("gender", "mgene"), vbeta = c(1, 1),
                agemin = 20)
head(fam2)
summary(fam2)