assess_pb_bias_correction: Test the effects of the parametric bootstrap bias correction...


View source: R/assess_pb_bias_correction.R

Description

This function is a rewrite of bias_comparison(). Unlike the original, it does not wrap the plotting inside the function (at Eric's request), and it returns a more informative data frame.

Usage

assess_pb_bias_correction(
  reference,
  gen_start_col,
  seed = 5,
  nreps = 50,
  mixsize = 100,
  alle_freq_prior = list(const_scaled = 1)
)

Arguments

reference

A genetic dataset in two-column format, with a "repunit" column specifying each individual's reporting unit of origin, a "collection" column specifying the collection (population or time of sampling), and an "indiv" column providing a unique name for each individual.

gen_start_col

The index of the first column containing genetic data in reference. Every column from this one onward should hold genetic data, and the two gene copies from the same locus should be in adjacent columns.

seed

The random seed used for the simulations.

nreps

The number of simulation replicates to run.

mixsize

The size of each simulated mixture sample.

alle_freq_prior

A one-element named list specifying the prior to be used when generating Dirichlet parameters for genotype likelihood calculations. The name of the element selects the method; valid methods include "const", "const_scaled" (the name used in the default shown under Usage), and "empirical". See ?list_diploid_params for method details.
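As a rough illustration of the arguments above, the following sketch builds a toy reference in two-column format and derives gen_start_col from it. The locus names, allele codes, and individual IDs are invented for illustration, and the real function may expect additional columns or column types that this page does not list.

```r
# Toy reference in two-column format: two adjacent columns per locus.
# All names and allele codes below are hypothetical.
reference <- data.frame(
  repunit    = c("North", "North", "South", "South"),
  collection = c("RiverA", "RiverA", "RiverB", "RiverC"),
  indiv      = c("ind01", "ind02", "ind03", "ind04"),
  Loc1       = c(101L, 101L, 103L, 105L),
  Loc1.1     = c(103L, 101L, 105L, 105L),
  Loc2       = c(210L, 212L, 212L, NA),
  Loc2.1     = c(212L, 212L, 214L, NA),
  stringsAsFactors = FALSE
)

# gen_start_col is the index of the first genetic column.
gen_start_col <- which(names(reference) == "Loc1")

# Everything from gen_start_col onward should be genetic data, in pairs.
n_gen_cols <- ncol(reference) - gen_start_col + 1
n_loci <- n_gen_cols / 2

# One-element named list for the prior, matching the default under Usage.
alle_freq_prior <- list(const_scaled = 1)
```

Looking the column up by name rather than hard-coding the index keeps the call correct if metadata columns are added or removed ahead of the genetic data.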

Details

Takes a reference genetic dataset in two-column format, draws a series of random "mixture" datasets with varying reporting unit proportions from this reference, and compares the results of GSI through standard MCMC with those obtained through parametric-bootstrap MCMC bias correction.

The amount of bias in reporting unit proportion calculations increases with the rate of misassignment between reporting units (that is, it decreases as genetic differentiation increases), and it increases as the number of collections within reporting units becomes more uneven.

Output from the standard Bayesian MCMC method demonstrates the level of bias to be expected for the input data set, and parametric bootstrapping is an empirical method for the removal of any existing bias.
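The mixture-drawing step described above can be sketched in base R as follows. This is a simplified illustration, not the package's actual simulation code: the flat Dirichlet draw for the true proportions, the resampling with replacement, and the toy ref data frame are all assumptions made for the example.

```r
set.seed(5)

# Hypothetical reference metadata: 3 reporting units, 60 individuals.
ref <- data.frame(
  repunit = rep(c("A", "B", "C"), times = c(30, 20, 10)),
  indiv   = sprintf("ind%02d", 1:60),
  stringsAsFactors = FALSE
)
repunits <- unique(ref$repunit)
mixsize  <- 100

# Draw true reporting unit proportions (rho) from a flat Dirichlet,
# using normalized gamma draws.
g <- rgamma(length(repunits), shape = 1)
true_rho <- setNames(g / sum(g), repunits)

# Assign each simulated mixture individual a reporting unit of origin,
# then resample one reference individual from that unit (with replacement).
origin <- sample(repunits, size = mixsize, replace = TRUE, prob = true_rho)
mixture_idx <- vapply(
  origin,
  function(ru) {
    candidates <- which(ref$repunit == ru)
    candidates[sample.int(length(candidates), 1)]
  },
  integer(1)
)
mixture <- ref[unname(mixture_idx), ]

# Realized proportions in the simulated mixture.
realized_rho <- table(factor(mixture$repunit, levels = repunits)) / mixsize
```

Repeating this draw nreps times, each with a fresh true_rho, would give the "varying reporting unit proportions" against which the two estimation methods are compared.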

Value

bias_comparison returns a list; the first element is a list of the relevant rho values generated on each iteration of the random "mixture" creation. This includes the true rho value, the standard result rho_mcmc, and the parametric bootstrapped rho_pb.

The second element is a data frame listing summary statistics for each reporting unit and estimation method. mse, the mean squared error, summarizes the deviation of the rho estimates from their true value, and reflects both bias and variance. mean_prop_bias is the average ratio of the residual to the true value, which gives greater weight to deviations at smaller true values. mean_bias is simply the average residual; unlike mse, it shows the direction of the bias.
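To make the three summary statistics concrete, here is a minimal sketch that computes them for a single reporting unit. The numbers are made up, summarise_bias is a hypothetical helper rather than part of the package, and the sign convention (estimate minus truth) is an assumption.

```r
# Hypothetical per-iteration results for one reporting unit.
res <- data.frame(
  true_rho = c(0.20, 0.40, 0.10, 0.30),
  rho_mcmc = c(0.26, 0.44, 0.16, 0.34),
  rho_pb   = c(0.21, 0.39, 0.11, 0.29)
)

# Illustrative helper: residual is taken as estimate minus truth.
summarise_bias <- function(estimate, truth) {
  residual <- estimate - truth
  c(
    mse            = mean(residual^2),        # squared deviation, sign lost
    mean_prop_bias = mean(residual / truth),  # residual relative to truth
    mean_bias      = mean(residual)           # signed average residual
  )
}

mcmc_stats <- summarise_bias(res$rho_mcmc, res$true_rho)
pb_stats   <- summarise_bias(res$rho_pb, res$true_rho)
```

In this toy data the MCMC estimates are consistently too high, so their mean_bias is positive, while the bootstrap-corrected estimates scatter around the truth and average out near zero. Note that mean_prop_bias is undefined when a true proportion is exactly zero.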

Examples

## Not run: 
## This takes too long to run in R CMD CHECK
ale_bias <- assess_pb_bias_correction(alewife, 17)

## End(Not run)

rubias documentation built on Feb. 10, 2022, 1:06 a.m.