View source: R/sim_add_de_novo_mut.R
sim_add_de_novo_mut | R Documentation |
This function adds de novo mutations (DNMs) to the generated data. It picks the total number of DNMs to add (using a Poisson distribution), the donor haplotype each DNM originates from (using a uniform distribution), and the SNP index in the phased diploid donor haplotypes after which each new DNM is inserted (using a uniform distribution). It then determines which gametes could potentially be affected by each DNM, because that SNP position in the gamete originates from the affected donor haplotype, and from those how many gametes actually are affected (using a gamma distribution). Using this information, it constructs a new row for each DNM. For the diploid donor haplotypes, the originating haplotype receives the alternate allele and the other haplotype the reference allele. For the full gamete data, unaffected gametes receive the reference allele and affected gametes the alternate allele. For the sparse gamete data, genotypes in each new row are replaced with NAs using the missing genotype rate and a uniform distribution. Finally, SNP indices are tracked and any recombination breakpoints are adjusted as necessary.
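The sampling steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the package's source: all toy values are hypothetical, and using the ceiling of a single gamma draw as a cap on the number of affected gametes is an assumption about how the gamma distribution is applied.

```r
set.seed(1)  # reproducible illustration

# Hypothetical toy values; these are not package defaults
de_novo_lambda <- 3
de_novo_alpha  <- 2
de_novo_beta   <- 1
num_snps       <- 10
num_gametes    <- 6

# gam_haps: donor haplotype of origin (1 or 2) for each SNP in each gamete
gam_haps <- matrix(sample(1:2, num_snps * num_gametes, replace = TRUE),
                   nrow = num_snps, ncol = num_gametes)

# Number of DNMs (Poisson); forced to at least 1 here so the sketch has rows to build
num_dnm <- max(1L, rpois(1, de_novo_lambda))

# Originating donor haplotype and the SNP index each DNM is placed after (both uniform)
donor_hap <- sample(1:2, num_dnm, replace = TRUE)
after_snp <- sample(seq_len(num_snps), num_dnm, replace = TRUE)

# For each DNM: gametes that could carry it, then a gamma-based cap (assumption)
affected <- lapply(seq_len(num_dnm), function(i) {
  candidates <- which(gam_haps[after_snp[i], ] == donor_hap[i])
  cap <- ceiling(rgamma(1, shape = de_novo_alpha, scale = de_novo_beta))
  n_affected <- min(length(candidates), cap)
  if (n_affected == 0) return(integer(0))
  candidates[sample.int(length(candidates), n_affected)]
})

# New donor rows: alternate allele (1) on the originating haplotype, reference (0) on the other
new_donor_rows <- cbind(as.integer(donor_hap == 1), as.integer(donor_hap == 2))

# New full-gamete rows: reference (0) everywhere except the affected gametes (1)
new_gam_rows <- matrix(0L, nrow = num_dnm, ncol = num_gametes)
for (i in seq_len(num_dnm)) {
  new_gam_rows[i, affected[[i]]] <- 1L
}
```

Here `new_donor_rows` and `new_gam_rows` are hypothetical names for the rows that would be spliced into `donor_haps` and `gam_mat`; the real function may organize these steps differently.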
sim_add_de_novo_mut(
  de_novo_lambda,
  de_novo_alpha,
  de_novo_beta,
  num_snps,
  num_gametes,
  gam_haps,
  gam_mat,
  gam_mat_with_na,
  donor_haps,
  unlist_ci,
  missing_genotype_rate
)
de_novo_lambda |
an integer, parameterizes a Poisson distribution to find the total number of DNMs |
de_novo_alpha |
a numeric, shape parameter for a gamma distribution to find the maximum number of gametes affected by each DNM |
de_novo_beta |
a numeric, scale parameter for a gamma distribution to find the maximum number of gametes affected by each DNM |
num_snps |
an integer, the number of SNPs (the number of rows) the generated data had before calling this function |
num_gametes |
an integer, the number of gametes (the number of columns) the generated data has |
gam_haps |
data matrix/frame of the haplotypes from which each SNP in each gamete originates (encoded as 1's and 2's), necessary to find which gametes could potentially be affected by each DNM |
gam_mat |
full data matrix/frame of the genotypes by SNP for each gamete, encoded in 0's and 1's |
gam_mat_with_na |
sparse data matrix/frame of the genotypes by SNP for each gamete, encoded in 0's, 1's, and NAs |
donor_haps |
a data frame with the phased diploid donor haplotypes in two columns |
unlist_ci |
a named vector from unlist with the crossover breakpoints for each gamete |
missing_genotype_rate |
a numeric, the missing genotype rate of the simulation |
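As a rough illustration of how `missing_genotype_rate` could be combined with a uniform draw to build a sparse row, consider the sketch below. The row values and the name `new_gam_row` are hypothetical, and the comparison `runif(...) < missing_genotype_rate` is an assumption about the masking rule rather than the package's confirmed implementation.

```r
set.seed(2)  # reproducible illustration

missing_genotype_rate <- 0.8                 # hypothetical rate
new_gam_row <- c(0L, 1L, 0L, 0L, 1L, 0L)     # hypothetical full-data row for one DNM

# One uniform draw per gamete; a genotype is masked when its draw falls below the rate
mask <- runif(length(new_gam_row)) < missing_genotype_rate

# Sparse version of the row: masked genotypes become NA, the rest are unchanged
new_gam_row_with_na <- replace(new_gam_row, mask, NA_integer_)
```

Under this rule, each genotype is independently missing with probability equal to the rate, so the observed fraction of NAs in a short row can differ noticeably from the nominal value.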
out: a named list with the adjusted unlist_ci, num_snps, gam_haps, gam_mat, gam_mat_with_na, and donor_haps, as well as the new element new_rows, which tracks where the new DNMs are
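The index bookkeeping behind the adjusted unlist_ci and new_rows can be pictured with a small sketch. The convention used here is an assumption, not taken from the package source: a DNM placed after SNP p shifts every later original index up by one, and a breakpoint located at p itself stays put. The names `after_snp`, `shift_index`, `adjusted_ci`, and `num_snps_new` are hypothetical.

```r
num_snps  <- 10
after_snp <- sort(c(3, 7, 7))                   # hypothetical SNP indices the DNMs are placed after
unlist_ci <- c(gam1 = 2, gam2 = 5, gam3 = 9)    # hypothetical breakpoints, in original indices

# Row index of each new DNM once all insertions are applied
new_rows <- after_snp + seq_along(after_snp)    # 4, 9, 10

# An original index moves up by the number of DNMs inserted strictly before it
shift_index <- function(idx) idx + sum(after_snp < idx)
adjusted_ci <- vapply(unlist_ci, shift_index, numeric(1))  # gam1 = 2, gam2 = 6, gam3 = 12

# The data now has one extra row per DNM
num_snps_new <- num_snps + length(after_snp)    # 13
```

With this convention, the rows not listed in `new_rows` are exactly the original SNPs in their original order, which makes it straightforward to map results back to the pre-DNM indices.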