sim.fit.stick.data.batch: Simulate and fit batch data under stickbreaking model

View source: R/simulate_data_fnxs.R

Description

Simulate and fit batch data under stickbreaking model

Usage

sim.fit.stick.data.batch(mut.vals, coe.vals, sig.vals, d.true,
  d.range = c(0.1, 10), w.wt, n.reps.ea = 100, print.status = FALSE,
  fit.methods = c("seq"), outdir, wts = c(2, 1), d.max.adj = 1.1,
  run.regression = TRUE, RDB.method)

Arguments

mut.vals

Vector of number of mutations to simulate

coe.vals

Vector of stickbreaking coefficients to simulate

sig.vals

Vector of sigma values to simulate

d.true

The distance to the boundary to use in simulations (d)

d.range

Vector of length two giving the range of d to search when estimating d by MLE (see details). Default is c(0.1, 10).

w.wt

Fitness of the wildtype

n.reps.ea

Number of replicates per parametric condition

print.status

TRUE/FALSE. Should loop counters be printed?

fit.methods

Vector of methods for estimating d; the model is fit and results are output for each method listed. Accepts "MLE", "RDB", "max", "seq", "RDB.all" and "All". "All" runs every method. Default is "seq". Case sensitive.

outdir

The path to write output files to (see details about file names).

wts

The weight assigned to wildtype vs other genotypes when estimating parameters (see details). Default c(2,1).

d.max.adj

When forced to use the maximum estimator, the estimate is adjusted upwards by this factor (see details). Default = 1.1 (inflate observation 10%).

run.regression

TRUE/FALSE. Should regression analysis be run when fitting the model? See details.

RDB.method

Indicates which RDB method to use when doing sequential estimation. Options are "pos" and "all". The "pos" option is better when mutations are strictly beneficial; "all" is appropriate when some or all mutations are deleterious.

Details

The function loops over every combination of the values in mut.vals, coe.vals and sig.vals, generating data under the stickbreaking model and then fitting it. The fit.methods argument lets the user evaluate the performance of several methods in one run. Note that the estimate of d under MLE is restricted to d.range. A reasonable upper bound is valuable here because it keeps the stickbreaking model distinct from the additive model (as d gets large and the stickbreaking coefficients get small, the stickbreaking model converges to the additive model).
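The set of parametric conditions visited by that loop can be sketched in base R. This is only an illustration of the combinations involved, not the function's internal code; the column names n.muts, coe and sigma are hypothetical, and the input values are taken from the example call on this page.

```r
# Values taken from the example call on this page
mut.vals  <- c(3, 4, 5)
coe.vals  <- c(0.1, 0.3, 0.5)
sig.vals  <- c(0.02, 0.05, 0.08)
n.reps.ea <- 10

# One row per parametric condition: every combination of the three vectors.
# Column names are hypothetical labels chosen for this sketch.
conditions <- expand.grid(n.muts = mut.vals, coe = coe.vals, sigma = sig.vals)

n.conditions <- nrow(conditions)          # 3 * 3 * 3 = 27 conditions
n.datasets   <- n.conditions * n.reps.ea  # 27 * 10 = 270 simulated data sets

# Number of conditions for each number of mutations
table(conditions$n.muts)
```

Because every condition is simulated and fit n.reps.ea times for each entry in fit.methods, the run time grows quickly with the lengths of the three input vectors.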

Results are written to files; the names of the output files are formed by concatenating the outdir argument with the corresponding item in fit.methods. Separate files are generated for each method (e.g. MLE, RDB, seq). Separate files are also generated for each number of mutations, because the dimensionality of the output changes with the number of mutations. The output files are tab-delimited text files with one row per replicate. The first 5 columns give the parameter values; the remaining columns give parameter estimates and measures of fit.

wts: The coefficient estimates are obtained by weighted comparisons. The default gives comparisons between the wild type and single-mutation genotypes twice the weight of all other comparisons, based on the assumption that the wild type is known with much lower error than the other genotypes (in fact it is assumed to be known with no error).
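As a rough illustration of what a 2:1 weighting does, the sketch below combines three per-comparison estimates of one coefficient with a weighted mean. Both the numbers and the assumption that comparisons are combined as a simple weighted mean are hypothetical; the package's actual estimator may differ in detail.

```r
# Hypothetical per-comparison estimates of a single coefficient (made-up values).
# The first comes from a wild type vs. single-mutation comparison.
est <- c(wt.vs.single = 0.30, other.1 = 0.24, other.2 = 0.27)

# Default weights: 2 for wild type comparisons, 1 for all other comparisons
wts <- c(2, 1)
w   <- c(wts[1], wts[2], wts[2])

# Weighted mean: (2 * 0.30 + 0.24 + 0.27) / 4
coe.hat <- weighted.mean(est, w)
```

With equal weights the same estimates would average to 0.27; the default weighting pulls the result toward the wild type comparison.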

run.regression: If you are running simulations only to assess parameter estimation, you do not need to run the regression. If you are using this function to generate data for model fitting, set this to TRUE.

Value

Nothing. Instead, results are written to output files and deposited in inst/extdata. The files are named by appending the method name (see details).

Examples

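## Not run: 
sim.fit.stick.data.batch(
  mut.vals = c(3, 4, 5),
  coe.vals = c(0.1, 0.3, 0.5),
  sig.vals = c(0.02, 0.05, 0.08),
  d.true = 1,
  d.range = c(0.1, 10),
  w.wt = 1,
  n.reps.ea = 10,
  print.status = FALSE,
  fit.methods = "seq",
  outdir = "~/Desktop",
  wts = c(2, 1),
  d.max.adj = 1.0,
  run.regression = FALSE,  # logical value, not the string "FALSE"
  RDB.method = "pos")

## End(Not run)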

Stickbreaker documentation built on May 29, 2017, 9:01 a.m.