fit.stick.model.given.d: Fit the stickbreaking model to data for a given value of d

View source: R/model_fitting_functions.R

Description

Fit the stickbreaking model to data for a given value of d

Usage

fit.stick.model.given.d(geno.matrix, fit.matrix, d.here, wts = c(2, 1),
  run.regression)

Arguments

geno.matrix

Genotype matrix generated in generate.geno.matrix or read in

fit.matrix

Fitness matrix generated in sim.stick.data or read in

d.here

The value of d on which the estimates are based

wts

Vector of weights applied to genotypes. Used when generate.geno.weight.matrix is called (see that function). Default is c(2,1), which weights single-mutation genotypes twice as heavily as others. Alternatively, a vector of weights corresponding to the genotypes in geno.matrix can be provided.

run.regression

TRUE/FALSE. Whether to run the regression analysis when fitting the model. See Details.

Details

Note that the coefficient estimates are obtained by weighting. By default, comparisons between the wild type and single-mutation genotypes receive twice the weight of all other comparisons, based on the assumption that wild-type fitness is known with much lower error than that of the other genotypes. Alternatively, a vector of weights can be supplied, with length equal to the number of genotypes in geno.matrix.
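
As a rough illustration of the default weighting, the sketch below builds a per-genotype weight vector in base R. It is not the package's generate.geno.weight.matrix; the small genotype matrix and the names n.muts.per.geno and geno.wts are hypothetical, and the rule that wts[1] applies to single-mutation genotypes and wts[2] to all others is an assumption drawn from the argument description.

```r
# Hypothetical genotype matrix: rows are genotypes, columns are mutations
# (1 = mutation present, 0 = absent). Not taken from the package data.
geno.matrix <- rbind(
  c(0, 0, 0),  # wild type
  c(1, 0, 0),  # single mutants
  c(0, 1, 0),
  c(0, 0, 1),
  c(1, 1, 0),  # double mutants
  c(1, 0, 1),
  c(0, 1, 1),
  c(1, 1, 1)   # triple mutant
)

wts <- c(2, 1)

# Number of mutations carried by each genotype
n.muts.per.geno <- rowSums(geno.matrix)

# Assumption: wts[1] for single-mutation genotypes, wts[2] for all others
geno.wts <- ifelse(n.muts.per.geno == 1, wts[1], wts[2])
```

The result has one entry per row of geno.matrix, which is the shape described for the alternative form of wts.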

In addition to R-squared, we assess model fit by doing linear regression of background fitness against effect. When the same model is used to generate and to analyze the data, the expected slope is zero and the p-values are uniform(0,1). The results from those regressions are returned in regression.results.
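
The kind of regression described above can be sketched for a single mutation with base R's lm. This is a minimal sketch, not the package's implementation: the simulated values are hypothetical, the effect is taken as the response and background fitness as the predictor (an assumption, since the text does not say which is which), and the effects are drawn independently of the background so that the true slope is zero.

```r
set.seed(1)  # reproducible simulated values

# Hypothetical data for one mutation: fitness of each background it is
# added to, and the effect measured on that background.
background.fitness <- runif(20, min = 1, max = 2)
effect <- rnorm(20, mean = 0.3, sd = 0.05)  # independent of background

# Linear regression of effect on background fitness
fit <- lm(effect ~ background.fitness)
coefs <- summary(fit)$coefficients

# Quantities analogous to lm.intercepts, lm.slopes and p.vals
lm.intercept <- coefs["(Intercept)", "Estimate"]
lm.slope     <- coefs["background.fitness", "Estimate"]
p.val        <- coefs["background.fitness", "Pr(>|t|)"]
```

Repeating this over many simulated data sets with a true slope of zero should give p-values spread roughly uniformly between 0 and 1.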

run.regression: If you are doing simulations to assess parameter estimation only, you don't need to run the regression. If you are using this function to generate data for model fitting, then this should be set to TRUE.

Examples

n.muts <- length(Khan.data[1,])-1
geno.matrix <- Khan.data[,seq(1, n.muts)]
fit.matrix <- as.matrix(Khan.data[,(n.muts+1)])
d.hat.MLE <- estimate.d.MLE(geno.matrix, fit.matrix, c(0.1, 10), 0.001, c(2,1))
d.hat.RDB <- estimate.d.RDB(geno.matrix, fit.matrix, -100)$d.hat.RDB
d.hat.seq <- estimate.d.sequential(geno.matrix, fit.matrix, d.hat.MLE, d.hat.RDB, c(0.1, 10), 1.1)
fit.stick.model.given.d(geno.matrix, fit.matrix, d.hat.seq, run.regression=TRUE)

Value

List:
[[1]] u.hats are the estimated stickbreaking coefficients
[[2]] R2 is proportion of fitness variation explained by model. Does not include wild type in calculation.
[[3]] sig.hat is estimate of sigma
[[4]] logLike is log-likelihood of the data under the fitted model.
[[5]] regression.results List of results when regressing effects of mutations against the background fitness of mutations (see details):
  [[1]] p.vals gives the p-value for each mutation.
  [[2]] lm.intercepts gives the estimated intercept for each mutation.
  [[3]] lm.slopes gives the slope for each mutation.
  [[4]] P is the sum of the log of p-values. This is the summary statistic.
  [[5]] fitness.of.backs Matrix with fitness of backgrounds when each mutation (columns) is added to each genotype (rows).
  [[6]] effects.matrix Matrix with fitness effect when given mutation (column) is added to given genotype (row).
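
The summary statistic P can be illustrated in one line of base R. The p-values below are hypothetical and are not output from the package.

```r
# Hypothetical per-mutation p-values, standing in for p.vals
p.vals <- c(0.62, 0.08, 0.35)

# P is the sum of the log of the p-values
P <- sum(log(p.vals))
```

Because each p-value is at most 1, P is never positive, and it becomes more negative as the p-values get smaller.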


Stickbreaker documentation built on May 29, 2017, 9:01 a.m.