analyzeConditionalAssociations: Stepwise conditional association analysis

View source: R/stats.R

analyzeConditionalAssociations    R Documentation

Stepwise conditional association analysis

Description

analyzeConditionalAssociations performs stepwise conditional testing: at each step the top-associated variable from the previous step is added to the model as a covariate, and testing continues until no remaining variable is significant at a user-defined threshold.
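The procedure described above can be sketched in base R. This is only an illustration on simulated data, not the midasHLA implementation: the helper stepwise_conditional, the variable names x1, x2, x3 and y, the use of lm, the per-step Bonferroni adjustment, and the exact point at which the residual sum of squares is checked are all assumptions made for the example.

```r
# Minimal sketch of stepwise conditional testing (simulated data;
# hypothetical helper, not the midasHLA implementation).
set.seed(1)
n <- 200
dat <- data.frame(x1 = rbinom(n, 1, 0.3),
                  x2 = rbinom(n, 1, 0.4),
                  x3 = rbinom(n, 1, 0.5))
dat$y <- 2 * dat$x1 + 1 * dat$x2 + rnorm(n)

stepwise_conditional <- function(data, response, candidates,
                                 th = 0.05, rss_th = 1e-07) {
  selected <- character(0)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Stop if the current model already fits (almost) perfectly
    base <- lm(reformulate(c(selected, "1"), response = response), data = data)
    if (sum(residuals(base)^2) < rss_th) break
    # Test each remaining variable, conditioning on those already selected
    pvals <- vapply(remaining, function(v) {
      fit <- lm(reformulate(c(selected, v), response = response), data = data)
      coef(summary(fit))[v, "Pr(>|t|)"]
    }, numeric(1))
    # Bonferroni adjustment over the variables tested in this step
    padj <- pmin(1, pvals * length(pvals))
    best <- which.min(padj)
    if (padj[best] > th) break
    # Carry the top-associated variable forward as a covariate
    selected <- c(selected, remaining[best])
  }
  selected
}

res <- stepwise_conditional(dat, "y", c("x1", "x2", "x3"))
res
```

The returned vector lists the selected variables in the order they entered the model, which loosely corresponds to the per-step results that keep = TRUE exposes.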

Usage

analyzeConditionalAssociations(
  object,
  variables,
  placeholder = "term",
  correction = "bonferroni",
  n_correction = NULL,
  th,
  th_adj = TRUE,
  keep = FALSE,
  rss_th = 1e-07,
  exponentiate = FALSE
)

Arguments

object

An existing fit from a model function such as lm, glm and many others.

variables

Character vector specifying variables to use in association tests.

placeholder

String specifying the term in the model formula to substitute with each value from variables. Ignored if set to NULL.

correction

String specifying multiple testing correction method. See details for further information.

n_correction

Integer specifying the number of comparisons to consider when calculating the multiple testing correction. For Bonferroni correction it is possible to specify a number lower than the number of comparisons being made. This is useful when knowledge about the biology or the redundancy of alleles reduces the need for correction. For other methods it must be at least equal to the number of comparisons being made; only set this (to non-default) when you know what you are doing!

th

Number specifying threshold for a variable to be considered significant.

th_adj

Logical flag indicating whether the adjusted p-value should be used as the threshold criterion; otherwise the unadjusted p-value is used.

keep

Logical flag indicating whether the output should be a list of the results from each selection step. The default is to return only the final result.

rss_th

Number specifying the residual sum of squares threshold at which the function should stop adding variables. As the residual sum of squares approaches 0, the model fits the data perfectly and further variable selection becomes meaningless. The stopping point can be controlled using rss_th.

exponentiate

Logical flag indicating whether or not to exponentiate the coefficient estimates. Internally this is passed to tidy. This is typical for logistic and multinomial regressions, but a bad idea if there is no log or logit link. Defaults to FALSE.
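To illustrate what exponentiating coefficient estimates means, the sketch below fits a logistic regression on simulated data with base R. It does not go through tidy; the data and the names x, y and fit are assumptions made for the example.

```r
# Sketch: coefficients of a logistic regression on the link scale and
# after exponentiation (simulated data, base R only).
set.seed(42)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 * x))
fit <- glm(y ~ x, family = binomial())

log_odds <- coef(fit)        # estimates on the log-odds (logit) scale
odds_ratio <- exp(log_odds)  # exponentiated estimates, read as odds ratios

rbind(log_odds, odds_ratio)
```

For a model without a log or logit link, such as a plain lm fit, the exponentiated values have no comparable interpretation, which is why the default is FALSE.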

Value

Tibble with stepwise conditional testing results, or a list of tibbles; see the keep argument. The first column, "term", holds the names of the variables. Further columns depend on the model used and are determined by the associated tidy function. Generally they will include "estimate", "std.error", "statistic", "p.value", "conf.low", "conf.high", "p.adjusted".
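The effect of n_correction on the "p.adjusted" column under Bonferroni correction can be sketched as follows. The helper bonferroni_n is hypothetical and not part of midasHLA, and whether the package computes the adjustment exactly this way is an assumption. Note that stats::p.adjust itself requires n to be at least length(p), so a smaller n has to be applied by hand.

```r
# Hypothetical helper: Bonferroni adjustment with a user-chosen number of
# comparisons n, which may be lower than the number of tests.
bonferroni_n <- function(p, n = length(p)) pmin(1, p * n)

p <- c(0.001, 0.01, 0.04)

# Default n: matches p.adjust(p, method = "bonferroni")
adj_default <- bonferroni_n(p)

# Fewer effective comparisons than tests, e.g. because of redundant alleles
adj_reduced <- bonferroni_n(p, n = 2)

data.frame(p.value = p, adj_default, adj_reduced)
```

With th_adj = TRUE the threshold th would be compared against adjusted values such as these; with th_adj = FALSE it would be compared against the raw p-values.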

Examples

midas <- prepareMiDAS(hla_calls = MiDAS_tut_HLA,
                      colData = MiDAS_tut_pheno,
                      experiment = "hla_alleles")

# analyzeConditionalAssociations expects model data to be a data.frame
midas_data <- as.data.frame(midas)

# define base model
object <- lm(disease ~ term, data = midas_data)
analyzeConditionalAssociations(object,
                               variables = c("B*14:02", "DRB1*11:01"),
                               th = 0.05)


Genentech/midasHLA documentation built on Feb. 12, 2024, 9:38 a.m.