gprob_sim_gc_missing | R Documentation |
Takes a list of parameters from a genetic dataset and returns a genotype log-likelihood matrix for individuals simulated by gene copy from the specified collections, with genotypes masked by the missing-data patterns of reference individuals.
gprob_sim_gc_missing(par_list, sim_colls, sim_missing)
par_list |
genetic data converted to the param_list format by |
sim_colls |
a vector; element i specifies the collection from which to sample the genotypes for individual i |
sim_missing |
a vector; element i specifies the index of the individual in par_list$I whose missing data pattern should be copied for simulated individual i |
In simulation by gene copy, the genotype at a locus for any individual is the result
of two random draws from the allele count matrix of that locus. Draws within an individual
are performed without replacement, but allele counts are replaced between individuals.
If the data at a particular locus are missing for the individual indexed by element i of sim_missing, they will also be missing in simulated individual i for the log-likelihood calculation.
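The sampling scheme described above can be illustrated for a single locus with a small base R sketch. This is not the package's implementation: the allele counts, the collection names, the helper draw_genotype, and the vector ref_missing are all hypothetical and exist only to show the two draws without replacement, the reset of counts between individuals, and the copying of a missing-data pattern.

```r
# Hypothetical allele counts for one locus: rows are collections, columns are alleles.
# These numbers are invented for illustration only.
set.seed(1)
allele_counts <- rbind(collA = c(a1 = 6, a2 = 3, a3 = 1),
                       collB = c(a1 = 1, a2 = 2, a3 = 7))

# Hypothetical helper: draw one diploid genotype from a collection by sampling
# two gene copies WITHOUT replacement from that collection's counted alleles.
draw_genotype <- function(counts) {
  pool <- rep(names(counts), times = counts)  # one entry per counted gene copy
  sample(pool, size = 2, replace = FALSE)
}

sim_colls   <- c("collA", "collB", "collB", "collA")  # source collection per simulated individual
ref_missing <- c(FALSE, TRUE, FALSE)                  # is this locus missing in reference individual j?
sim_missing <- c(1, 2, 2, 3)                          # reference individual whose pattern is copied

# Every individual is drawn from the full counts, so allele counts are
# effectively replaced between individuals.
genos <- t(sapply(sim_colls, function(k) draw_genotype(allele_counts[k, ])))

# Mask the locus wherever the copied reference individual had missing data.
genos[ref_missing[sim_missing], ] <- NA
genos
```

In this toy case, simulated individuals 2 and 3 both copy reference individual 2, whose genotype at the locus is missing, so both of their rows are set to NA before any likelihood would be computed.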
# If one wanted to simulate the missing data patterns
# of a troublesome mixture dataset, one would run tcf2param_list,
# selecting samp_type = "mixture", then draw sim_miss from
# the mixture individual genotype list
# make a fake mixture data set to demonstrate...
drawn <- mixture_draw(alewife, rhos = c(1/3, 1/3, 1/3), N = 100)
ref <- drawn$reference
mix <- drawn$mix
# then run it...
# we have to get the ploidies to pass to tcf2param_list
locnames <- names(alewife)[-(1:16)][c(TRUE, FALSE)]
ploidies <- rep(2, length(locnames))
names(ploidies) <- locnames
params <- tcf2param_list(rbind(ref, mix), 17, samp_type = "mixture", ploidies = ploidies)
sim_colls <- sample(params$C, 1070, replace = TRUE)
sim_miss <- sample(length(params$coll), 1070, replace = TRUE)
ale_sim_gprobs_miss <- gprob_sim_gc_missing(params, sim_colls, sim_miss)