HAC.simrep | R Documentation |
Runs the HACSim
algorithm by successively calling
HAC.sim
to iteratively extrapolate haplotype accumulation curves to
determine likely specimen sample sizes for hypothetical or real species
The algorithm employs the following iterative methods when calculating the "Measures of Sampling Closeness":
Mean number of haplotype sampled: H_i
Mean number of haplotypes not sampled H^* - H_i
Proportion of haplotypes sampled: \frac{H_i}{H^*}
Proportion of haplotypes not sampled: 1 - \frac{H_i}{H^*}
Mean value of N*: \frac{N_iH^*}{H_i}
Mean number of specimens not sampled: \frac{N_iH^*}{H_i} - N_i
where H_i is stochastically-determined through sampling from
probs
, the observed species' haplotype frequency distribution vector.
As the algorithm proceeds, H_i will approach H^* asymptotically (and hence, N_i will converge to N^*), but will likely fluctuate randomly from one iteration to the next. However, estimates of N^* found at each iteration will be monotonically-increasing.
HAC.simrep(HACSObject)
HACSObject |
object containing the desired simulation parameters |
Iteration results are outputted to the console and graphs displayed in
the plot window. Plots depict haplotype accumulation (along with shaded
confidence intervals for the mean number of haplotypes found). Dashed lines
correspond to the endpoint of the curve and reflect haplotype recovery for a
user-defined cutoff (default p
= 0.95, 95% haplotype diversity). Output
from the first iteration is useful for judging levels of haplotype diversity
and recovery found in observed intraspecific sequence datasets, reflecting
current sampling depth. The required sample size is displayed in the second-
last iteration. All other information corresponding to the extrapolated sample
size can be found in the last iteration. Iteration results can optionally be
saved to a CSV file. Subsampled DNA sequences are automatically saved to a
FASTA file.
When simulating real species via HACReal(...)
, a pop-up window will appear prompting the user to select an intraspecific FASTA file of
aligned/trimmed DNA sequences. The alignment must not contain missing or
ambiguous nucleotides (i.e., it should only contain A, C, G or T); otherwise, haplotype diversity may be overestimated. Excluding sequences or alignment
sites with missing/ambiguous data is an option.
## Simulate hypothetical species ## N <- 100 # total number of sampled individuals Hstar <- 10 # total number of haplotypes probs <- rep(1/Hstar, Hstar) # equal haplotype frequency distribution HACSObj <- HACHypothetical(N = N, Hstar = Hstar , probs = probs, filename = "output") # outputs a CSV file called "output.csv" ## Simulate hypothetical species - subsampling ## HACSObj <- HACHypothetical(N = N, Hstar = Hstar, probs = probs, perms = 1000, p = 0.95, subsample = TRUE, prop = 0.25, conf.level = 0.95, filename = "output") ## Simulate hypothetical species and all paramaters changed - subsampling ## HACSObj <- HACHypothetical(N = N, Hstar = Hstar, probs = probs, perms = 10000, p = 0.90, subsample = TRUE, prop = 0.15, conf.level = 0.95, filename = "output") try(HAC.simrep(HACSObj)) # runs a simulation ## Simulate real species ## ## Not run: ## Simulate real species ## # outputs file called "output.csv" HACSObj <- HACReal(filename = "output") ## Simulate real species - subsampling ## HACSObj <- HACReal(subsample = TRUE, prop = 0.15, conf.level = 0.95, filename = "output") ## Simulate real species and all parameters changed - subsampling ## HACSObj <- HACReal(perms = 10000, p = 0.90, subsample = TRUE, prop = 0.15, conf.level = 0.99, filename = "output") # user prompted to select appropriate FASTA file try(HAC.simrep(HACSObj)) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.