View source: R/sim_run_generative_model.R
sim_run_generative_model | R Documentation |
This function runs the generative model to simulate sparse gamete input data for rhapsodi. In addition to returning the sparse gamete data, the function also returns the fully known generated gamete data, the diploid donor phased haplotypes, and the true recombination break points for each gamete. The following variables of the simulation can all be controlled: the number of gametes, the number of SNPs, the sequencing coverage (or missing genotype rate), the average recombination rate, whether to simulate sequencing error, the sequencing error rate to use, whether to add de novo mutations, the values parameterizing how many de novo mutations there are and how many gametes each one affects, and the random seed for reproducibility.
sim_run_generative_model(
  num_gametes,
  num_snps,
  coverage,
  recomb_lambda,
  random_seed = 42,
  input_cov = TRUE,
  input_mgr = FALSE,
  missing_genotype_rate = NULL,
  add_seq_error = TRUE,
  seqError_add = 0.005,
  add_de_novo_mut = FALSE,
  de_novo_lambda = 5,
  de_novo_alpha = 7.5,
  de_novo_beta = 10
)
num_gametes |
an integer, the number of gametes, or the number of columns for the sparse gamete data you want generated |
num_snps |
an integer, the number of SNPs, or the number of rows for the sparse gamete data you want generated. Note: not all of these will be heterozygous due to the coverage and therefore this number won't necessarily equal the number of SNPs following filtering at the end of the generation |
coverage |
a numeric, the sequencing coverage, input if input_cov is TRUE, suggested NULL otherwise |
recomb_lambda |
a numeric, the average rate of recombination expected for the simulation |
random_seed |
an integer, the random seed which will be set for the simulation, default=42 |
input_cov |
a logical, TRUE if coverage (e.g. 0.01, meaning 0.01x) will be input rather than missing genotype rate, default = TRUE |
input_mgr |
a logical, TRUE if missing genotype rate (e.g. 80 (%) or 0.8) will be input rather than coverage, default = FALSE |
missing_genotype_rate |
a numeric, input if input_mgr is TRUE and input_cov is FALSE, suggested NULL otherwise, default=NULL |
add_seq_error |
a logical, TRUE if you want to add sequencing error to the generated data, default=TRUE |
seqError_add |
a numeric, the sequencing error rate if adding sequencing error to the generated data, default=0.005 |
add_de_novo_mut |
a logical, TRUE if you want to add de novo mutations to the generated data, default=FALSE |
de_novo_lambda |
an integer, default=5, parameterizes a Poisson distribution to find the total number of de novo mutations (DNM) |
de_novo_alpha |
a numeric, default=7.5, shape parameter for a gamma distribution to find the number of gametes affected per DNM |
de_novo_beta |
a numeric, default=10, scale parameter for a gamma distribution to find the number of gametes affected per DNM |
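The three de novo mutation arguments describe two draws: a Poisson draw for the total number of DNMs and a gamma draw for how many gametes each DNM affects. The base R sketch below illustrates that parameterization only; it is not the package's internal code. The rounding and the clamping to the range 1 to num_gametes are assumptions made here so that the counts are usable, and num_gametes = 1000 is an arbitrary illustrative value.

```r
# Illustrative sketch of the DNM parameterization described above.
# Not rhapsodi's internal implementation; rounding and clamping are assumptions.
set.seed(42)

num_gametes    <- 1000  # arbitrary value chosen for this sketch
de_novo_lambda <- 5     # documented default
de_novo_alpha  <- 7.5   # documented default (gamma shape)
de_novo_beta   <- 10    # documented default (gamma scale)

# Total number of de novo mutations: one Poisson draw
num_dnm <- rpois(1, lambda = de_novo_lambda)

# Number of gametes affected per DNM: one gamma draw per mutation
gametes_affected <- rgamma(num_dnm, shape = de_novo_alpha, scale = de_novo_beta)

# Assumption: convert to whole gametes and keep within 1..num_gametes
gametes_affected <- pmin(num_gametes, pmax(1, round(gametes_affected)))

print(num_dnm)
print(gametes_affected)
```

With the default shape and scale, the gamma distribution has mean 7.5 * 10 = 75, so each DNM would be expected to affect on the order of 75 gametes before any clamping.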
generated_data, a named list returning the generated input and full truth data, specifically: gam_na for the sparse rhapsodi input, gam_full for the fully known equivalent of the gamete data input, recomb_spots for the true recombination spots for each gamete, and donor_haps for the diploid donor phased haplotypes
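A call might look like the sketch below. The argument values (50 gametes, 5000 SNPs, coverage of 0.1, recomb_lambda of 1) are illustrative assumptions, not recommended settings, and the sketch assumes the rhapsodi package is installed and exports this function; the call is skipped otherwise. Only the four list element names documented above are relied on, and no assumption is made about their internal structure.

```r
# Hypothetical argument values for a small simulation (illustrative only)
sim_args <- list(
  num_gametes   = 50,
  num_snps      = 5000,
  coverage      = 0.1,  # used because input_cov defaults to TRUE
  recomb_lambda = 1,
  random_seed   = 42
)

# Run only if rhapsodi is available (assumes the function is exported)
if (requireNamespace("rhapsodi", quietly = TRUE)) {
  generated_data <- do.call(rhapsodi::sim_run_generative_model, sim_args)

  # Top-level overview of the returned named list
  str(generated_data, max.level = 1)

  sparse_input <- generated_data$gam_na        # sparse rhapsodi input
  full_truth   <- generated_data$gam_full      # fully known gamete data
  true_breaks  <- generated_data$recomb_spots  # true recombination spots
  donor_haps   <- generated_data$donor_haps    # phased donor haplotypes
}
```

To specify a missing genotype rate instead of coverage, the argument descriptions suggest setting input_cov = FALSE, input_mgr = TRUE, coverage = NULL, and supplying missing_genotype_rate.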