sequoiaSim | R Documentation |
sequoiaSim
runs simulations to estimate the error
expected when using sequoia for single parent assignments with your baseline
sequoiaSim(
  input_parameters,
  LLR_min = 0.5,
  parent_data_file,
  min_genotyped = 0.9,
  allele_error_rate = 0.001,
  Seq_MaxMismatch = NA,
  prefix = ""
)
input_parameters |
This is a data frame of input parameters that controls the simulations. For example:
parameters <- data.frame(
  Proportion_baseline_sampled = c(1, 0.9, 0.85, 0.75),
  Number_of_simulations_to_run = c(100, 200, 200, 200),
  Number_of_offspring = c(1000, 3000, 3000, 3000),
  Parent_expansion = c(TRUE, TRUE, TRUE, TRUE)
) |
LLR_min |
This is the minimum LLR required to accept a single parent assignment. |
parent_data_file |
This is either a dataframe or a file path to a csv with a header and your baseline data
in it. Each row is an individual, and columns are: PopulationName,IndividualName,Sex,SNP1,SNP2,SNP3,...
Sex is either M or F. SNPs are one call per column, with one character per allele (e.g., AT, TT, AA).
Individual names must be unique, even across populations. This file can be generated using
the |
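As an illustration of the layout described above, the following is a minimal sketch of a toy baseline built in R and written to a csv with a header. The population names, individual names, genotypes, and the output path are all made up for the example, and the checks shown are not part of sequoiaSim itself. Missing genotypes are not covered here, since their coding is not stated above.

```r
# Hypothetical toy baseline in the documented column order:
# PopulationName, IndividualName, Sex, SNP1, SNP2, ...
baseline <- data.frame(
  PopulationName = c("PopA", "PopA", "PopB"),
  IndividualName = c("A_001", "A_002", "B_001"),
  Sex            = c("M", "F", "F"),
  SNP1           = c("AT", "TT", "AA"),
  SNP2           = c("CC", "CG", "GG"),
  stringsAsFactors = FALSE
)

# Checks on the documented constraints (illustrative only).
# anyDuplicated() returns 0 when every name is unique.
names_unique <- !anyDuplicated(baseline$IndividualName)
sex_valid    <- all(baseline$Sex %in% c("M", "F"))

# Every SNP call should be two characters, one per allele.
snp_cols    <- setdiff(names(baseline),
                       c("PopulationName", "IndividualName", "Sex"))
calls_valid <- all(nchar(as.matrix(baseline[snp_cols])) == 2)

# Write a csv with a header row; the path is an arbitrary temp file.
baseline_path <- file.path(tempdir(), "baseline.csv")
write.csv(baseline, baseline_path, row.names = FALSE, quote = FALSE)
```

Either the `baseline` data frame or `baseline_path` could then be supplied as parent_data_file.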
min_genotyped |
This is the proportion of genotypes that must be non-missing (e.g., 0.9 for 90% of genotypes) for a simulated parent or offspring to be included in the analysis. |
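To make the threshold concrete, here is a rough sketch of how a per-individual genotyping proportion could be computed and filtered on. It assumes missing genotypes are coded as NA and that the cutoff is inclusive; neither is confirmed by this page, and the data and variable names are hypothetical rather than taken from sequoiaSim.

```r
# Hypothetical genotypes; NA marks a missing call (an assumption).
geno <- data.frame(
  IndividualName = c("ind1", "ind2", "ind3"),
  SNP1 = c("AT", NA,   "AA"),
  SNP2 = c("CC", "CG", NA),
  SNP3 = c("GG", "GG", NA),
  SNP4 = c("TT", "TA", "AA"),
  stringsAsFactors = FALSE
)

min_genotyped <- 0.75

# Proportion of non-missing calls per individual (row).
snp_cols       <- grep("^SNP", names(geno), value = TRUE)
prop_genotyped <- rowMeans(!is.na(geno[snp_cols]))

# Keep individuals at or above the threshold (inclusive cutoff assumed).
kept <- geno[prop_genotyped >= min_genotyped, ]
```

With these toy data the proportions are 1.00, 0.75, and 0.50, so ind1 and ind2 are kept and ind3 is dropped.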
allele_error_rate |
This is the per allele genotyping error rate |
Seq_MaxMismatch |
This is the MaxMismatch parameter to use for Sequoia. If not specified, the default is to use 5% of the number of markers in your panel (rounded up to the nearest integer). |
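The default described above is simple arithmetic, sketched below. The helper name `default_max_mismatch` is hypothetical and is not claimed to be how sequoiaSim computes the value internally.

```r
# Hypothetical helper: 5% of the marker count, rounded up.
# Multiplying by 5 and dividing by 100 avoids the floating-point
# representation error of writing the rate as 0.05.
default_max_mismatch <- function(n_markers) {
  ceiling(n_markers * 5 / 100)
}

mm_300 <- default_max_mismatch(300)  # 5% of 300 is exactly 15
mm_95  <- default_max_mismatch(95)   # 5% of 95 is 4.75, rounded up to 5
```

To use a stricter or looser limit than this default, pass an explicit integer as Seq_MaxMismatch.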
prefix |
This is the prefix to add to the output file name. Default is no prefix. |
This runs simulations given the parameters you input; see the separate writeup for a full description of how the simulations are run.
This function writes its output as a csv to the working directory after all simulations have finished.