View source: R/run_happi.R

happi    R Documentation

Main function for happi, p = q = 1

Description

Main function for happi, with p = q = 1. This script contains the modularized version of happi with the correct implementation of the log-likelihood.

Usage

happi(
  outcome,
  covariate = NULL,
  h0_param = 2,
  quality_var = NULL,
  covariate_formula = NULL,
  covariate_formula_h0 = NULL,
  quality_var_formula = NULL,
  data = NULL,
  max_iterations = 1000,
  min_iterations = 15,
  change_threshold = 0.05,
  epsilon = 0,
  method = "splines",
  random_starts = FALSE,
  firth = TRUE,
  spline_df = 3,
  nstarts = 1,
  seed = 13,
  norm_sd = 1,
  run_npLRT = FALSE,
  P = NULL,
  verbose = TRUE
)

Arguments

outcome

length-n vector; this is the vector of a target gene's presence/absence; should be coded as 0 or 1

covariate

n x p matrix; this is the matrix for the primary predictor/covariate of interest

h0_param

the index of the column in covariate whose coefficient (beta) is zero under the null hypothesis; default 2

quality_var

length-n vector; this is the quality variable vector, currently q = 1 (TODO: turn into an n x q matrix)

covariate_formula

alternative to covariate argument, a formula for covariates of the form ~ covariate1 + covariate2 + ..., requires data argument

covariate_formula_h0

alternative to h0_param argument, a formula for covariates in the null model, takes the form ~ 1 for an intercept-only model, requires data argument

quality_var_formula

alternative to quality_var argument, a formula for quality variable of the form ~ quality_var, requires data argument

data

required with formula arguments, a data frame including covariates and the quality variable

max_iterations

the maximum number of EM steps that the algorithm will run for

min_iterations

the minimum number of EM steps that the algorithm will run for

change_threshold

the algorithm terminates early if the likelihood changes by this percentage or less for 5 consecutive iterations under both the alternative and the null models

epsilon

probability of observing a gene when it should be absent; probability between 0 and 1; default is 0. Either a single value or a vector of length n.

method

method for estimating f. Defaults to "splines", which fits a monotone spline with degrees of freedom set by the argument spline_df; use "isotone" for an isotonic regression fit

random_starts

whether to pick the starting values of the betas randomly. Defaults to FALSE.

firth

whether to use the Firth penalty. Defaults to TRUE.

spline_df

degrees of freedom (in addition to intercept) to use in monotone spline fit; default 3

nstarts

integer; the number of starts for the optimization. Defaults to 1.

seed

numeric value used to set the seed for random multiple starts

norm_sd

positive number; the standard deviation of the Normal distribution from which initial parameter values are drawn.

run_npLRT

logical; if TRUE, a nonparametric permutation likelihood ratio test (LRT) is also run.

P

if run_npLRT is TRUE, number of permutations to run

verbose

TRUE to return all information generated by happi, FALSE to only return effect size and p-value
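
The early-stopping rule described by max_iterations, min_iterations, and change_threshold can be sketched in base R. This is only an illustration, not happi's internal code: the helper stop_iteration() is hypothetical, and it assumes that "change" means the relative change between consecutive log-likelihood values, compared directly against change_threshold, and that early termination is only allowed once min_iterations has been reached. It tracks a single model, whereas happi requires the condition to hold for both the alternative and the null.

```r
# Hypothetical helper, not part of happi: returns the iteration at which the
# documented stopping rule would end the EM loop for one model's log-likelihood trace.
stop_iteration <- function(loglik, change_threshold = 0.05,
                           min_iterations = 15, max_iterations = 1000,
                           run_length = 5) {
  n_iter <- min(length(loglik), max_iterations)
  small_run <- 0  # number of consecutive "small change" iterations so far
  for (t in seq_len(n_iter)[-1]) {
    # Assumption: change is measured relative to the previous value.
    rel_change <- abs(loglik[t] - loglik[t - 1]) / abs(loglik[t - 1])
    small_run <- if (rel_change <= change_threshold) small_run + 1 else 0
    # Assumption: early termination is only permitted after min_iterations.
    if (t >= min_iterations && small_run >= run_length) {
      return(t)
    }
  }
  n_iter  # no early stop: ran to the end of the trace or to max_iterations
}

# Made-up log-likelihood trace that levels off near -100.
loglik <- -100 - 50 * exp(-0.5 * (1:40))
stop_iteration(loglik)
stop_iteration(loglik, min_iterations = 1)
```

Under these assumptions, a smaller change_threshold or a larger min_iterations can only delay termination, which is why the two arguments are usually tuned together.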

Value

An object of class happi.

Examples

library(happi)
data(TM7_data)
# Design matrix with an intercept column and the tongue covariate
x_matrix <- model.matrix(~tongue, data = TM7_data)
happi_results <- happi(
  outcome = TM7_data$`Cellulase/cellobiase CelA1`,
  covariate = x_matrix,
  quality_var = TM7_data$mean_coverage,
  max_iterations = 1000,
  change_threshold = 0.1,
  epsilon = 0,
  nstarts = 1,
  spline_df = 3
)
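
The following sketch illustrates how the matrix arguments (covariate, h0_param) relate to the formula arguments (covariate_formula, covariate_formula_h0, quality_var_formula, data). The toy data frame and its values are made up for illustration. The final call uses only arguments documented above, but it has not been checked against the package, so treat its equivalence to the matrix-based example as an assumption.

```r
# Toy data standing in for TM7_data (hypothetical values).
toy <- data.frame(
  tongue = c(0, 1, 0, 1, 1, 0),
  mean_coverage = c(5.2, 11.8, 3.1, 20.4, 8.7, 15.0)
)

# Matrix interface: column 1 is the intercept and column 2 is tongue,
# so the default h0_param = 2 points at the tongue coefficient.
x_matrix <- model.matrix(~ tongue, data = toy)
colnames(x_matrix)

# Formula interface: the null model ~ 1 keeps only the intercept.
x_null <- model.matrix(~ 1, data = toy)

# The column present in the full design but absent from the null design
# is the one whose coefficient is set to zero under the null.
tested <- setdiff(colnames(x_matrix), colnames(x_null))
which(colnames(x_matrix) == tested)

# Formula-based call, run only if the happi package is installed.
if (requireNamespace("happi", quietly = TRUE)) {
  library(happi)
  data(TM7_data)
  happi_formula_results <- happi(
    outcome = TM7_data$`Cellulase/cellobiase CelA1`,
    covariate_formula = ~ tongue,
    covariate_formula_h0 = ~ 1,
    quality_var_formula = ~ mean_coverage,
    data = TM7_data,
    change_threshold = 0.1
  )
}
```

The formula interface avoids having to track column positions by hand, which matters once the design matrix has more than one covariate.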

statdivlab/happi documentation built on April 19, 2024, 2:04 a.m.