ref_and_mix_pipeline: Estimate mixing proportions from reference and mixture...


View source: R/simulation_pipelines.R

Description

Takes a mixture and a reference dataframe of two-column genetic data, along with the desired method for estimating the population mixture proportions (MCMC, PB, or BH MCMC), and returns the output of the chosen estimation method. Only "MCMC" is currently supported; see the method argument.

Usage

ref_and_mix_pipeline(
  reference,
  mixture,
  gen_start_col,
  method = "MCMC",
  reps = 2000,
  burn_in = 100,
  sample_int_Pi = 0,
  sample_int_PofZ = 0,
  sample_int_omega = 0,
  sample_int_rho = 0,
  sample_int_PofR = 0
)

Arguments

reference

a dataframe of two-column genetic format data, preceded by "repunit", "collection", and "indiv" columns. A "sample_type" column is not required and will be overwritten if provided.

mixture

a dataframe of two-column genetic format data. Must have the same structure as the reference dataframe, but the "collection" and "repunit" columns are ignored. A "sample_type" column is not required and will be overwritten if provided.

gen_start_col

the first column of genetic data in both data frames
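
As a rough illustration of the layout these three arguments describe, the sketch below builds a tiny made-up reference and mixture in base R. The locus names (loc_a, loc_b), the ".1" suffix for the second gene copy, the individual IDs, the allele codes, and the placeholder values in the mixture are all assumptions chosen for illustration; none of it is data from the package, and it has not been checked against the function itself.

```r
# Hypothetical toy data: 2 loci, each stored in two adjacent columns
# (one column per gene copy). All names and values are made up.
reference_toy <- data.frame(
  sample_type = "reference",
  repunit     = c("north", "north", "south", "south"),
  collection  = c("creek_a", "creek_a", "creek_b", "creek_b"),
  indiv       = paste0("ref_", 1:4),
  loc_a       = c(1L, 1L, 2L, 2L),
  loc_a.1     = c(1L, 2L, 2L, 2L),
  loc_b       = c(3L, 3L, 4L, 3L),
  loc_b.1     = c(3L, 4L, 4L, 4L),
  stringsAsFactors = FALSE
)

# The mixture has the same columns; repunit and collection are ignored,
# so they only hold placeholder values here.
mixture_toy <- data.frame(
  sample_type = "mixture",
  repunit     = NA_character_,
  collection  = "mix_1",
  indiv       = paste0("mix_", 1:2),
  loc_a       = c(1L, 2L),
  loc_a.1     = c(2L, 2L),
  loc_b       = c(3L, 4L),
  loc_b.1     = c(3L, 4L),
  stringsAsFactors = FALSE
)

# gen_start_col is the position of the first genetic column in both frames.
gen_start_col_toy <- match("loc_a", names(reference_toy))
```

With a "sample_type" column present, the first genetic column is column 5, which matches the value used in the Examples section.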

method

this must be "MCMC". "PB" and "BH" are no longer supported in this function.

reps

the number of iterations to be performed in MCMC

burn_in

the number of initial reps to discard when computing the means. Discarded reps are still returned in the traces if traces are requested.
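
The effect of burn_in on the mean can be sketched in base R as follows. The trace here is simulated with rnorm() purely as a stand-in, and the exact indexing the package uses internally is an assumption.

```r
set.seed(1)
reps    <- 2000
burn_in <- 100

# Stand-in for a sampled trace of one mixing proportion; real values would
# come from the sampler, not from rnorm().
pi_trace <- 0.3 + rnorm(reps, sd = 0.02)

# Average only the reps after burn-in; the full trace is left untouched.
kept <- pi_trace[(burn_in + 1):reps]
posterior_mean <- mean(kept)

length(pi_trace)  # every rep is still present in the trace
length(kept)      # only reps - burn_in values enter the mean
```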

sample_int_Pi

the number of reps between samples being taken for pi traces. If 0 no traces are taken. Only used in methods "MCMC" and "PB".

sample_int_PofZ

the number of reps between samples being taken for the posterior traces of each individual's collection of origin. If 0 no trace samples are taken. Used in all methods.

sample_int_omega

the number of reps between samples being taken for collection proportion traces. If 0 no traces are taken. Only used in method "BH".

sample_int_rho

the number of reps between samples being taken for reporting unit proportion traces. If 0 no traces are taken. Only used in method "BH".

sample_int_PofR

the number of reps between samples being taken for the posterior traces of each individual's reporting unit of origin. If 0 no trace samples are taken. Only used in method "BH".
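
The sample_int_* arguments all describe the same kind of thinning. A minimal base R sketch of which reps would be recorded for a given interval is shown below; the value 50 is hypothetical, and whether the package starts counting at the first rep or at the interval itself is an assumption.

```r
reps       <- 2000
sample_int <- 50   # hypothetical value for one of the sample_int_* arguments

# Reps that would be recorded if every sample_int-th rep is kept.
# Starting at rep sample_int (rather than rep 1) is an assumption.
trace_reps <- if (sample_int > 0) {
  seq(from = sample_int, to = reps, by = sample_int)
} else {
  integer(0)  # an interval of 0 means no trace is taken
}

length(trace_reps)
```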

Details

"MCMC" estimates mixing proportions and individual posterior probabilities of assignment through Markov-chain Monte Carlo, while "PB" does the same with a parametric bootstrapping correction, and "BH" uses the misassignment-scaled, hierarchical MCMC. All methods use a uniform 1/(# collections or RUs) prior for pi/omega and rho.
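
One way to read the prior described above: with K collections, every collection gets the same prior value 1/K, and likewise 1/(number of reporting units) at the reporting unit level. The sketch below only computes those values for made-up names; how they enter the sampler is not spelled out here and is not assumed.

```r
collections <- c("creek_a", "creek_b", "creek_c", "creek_d")  # made-up names
repunits    <- c("north", "south")                             # made-up names

# Same prior value for every collection and for every reporting unit.
pi_prior  <- setNames(rep(1 / length(collections), length(collections)),
                      collections)
rho_prior <- setNames(rep(1 / length(repunits), length(repunits)),
                      repunits)
```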

Value

ref_and_mix_pipeline returns the standard output of the chosen mixing proportion estimation method (always a list). For method "PB", it returns the standard MCMC results, as well as the bootstrap-corrected collection proportions under $mean$bootstrap.

Examples

reference <- small_chinook_ref
mixture <- small_chinook_mix
gen_start_col <- 5

# this function expects things as factors.  This function is old and needs
# to be replaced and deprecated.

reference$repunit <- factor(reference$repunit, levels = unique(reference$repunit))
reference$collection <- factor(reference$collection, levels = unique(reference$collection))
mixture$repunit <- factor(mixture$repunit, levels = unique(mixture$repunit))
mixture$collection <- factor(mixture$collection, levels = unique(mixture$collection))

mcmc <- ref_and_mix_pipeline(reference, mixture, gen_start_col, method = "MCMC")
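
The factor() calls in this example pass levels = unique(...), which keeps the levels in order of first appearance instead of the default alphabetical order. A small base R sketch with made-up labels shows the difference:

```r
x <- c("south", "north", "south", "central")  # made-up labels

# Default: levels are the sorted unique values.
default_levels <- levels(factor(x))

# With levels = unique(x): levels follow the order of first appearance.
appearance_levels <- levels(factor(x, levels = unique(x)))

default_levels
appearance_levels
```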

rubias documentation built on Feb. 10, 2022, 1:06 a.m.