compare_pangenome_covariates: compare_pangenome_covariates
In gtonkinhill/panstripe: Pangenome Plots

View source: R/compare_pangenome_covariates.R

Description

Tests whether covariates describing different pangenomes are significantly associated with gene gain and loss, or with errors.

Usage

compare_pangenome_covariates(
  fits,
  covariates,
  family = "Tweedie",
  keep = "all",
  modeldisp = FALSE,
  ci_type = "norm",
  conf = 0.95,
  nboot = 100
)

Arguments

`fits`	a list of 'panfit' objects generated by the 'panstripe' function
`covariates`	a 'data.frame' whose first column matches the names in the list of pangenomes given in `fits`. Covariates to be tested are given in subsequent columns.
`family`	the family used by glm. One of 'Tweedie', 'Poisson', 'Gamma' or 'Gaussian'. (default='Tweedie')
`keep`	a vector naming the subset of columns in the covariate data.frame to keep (default='all')
`modeldisp`	whether or not to model the dispersion as a function of the covariates of interest if using a Tweedie family (default=FALSE)
`ci_type`	the method used to calculate the bootstrap CI (default='norm'). See boot.ci for more details.
`conf`	A scalar indicating the confidence level of the required intervals (default=0.95)
`nboot`	the number of bootstrap replicates to perform (default=100)
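
The `ci_type`, `conf` and `nboot` arguments refer to bootstrap confidence intervals as computed by boot.ci. The following is a minimal sketch of how the interval type changes the result, using the 'boot' package directly on toy data. The data, the `mean_stat` helper and the variable names are illustrative assumptions and are not part of panstripe; how `compare_pangenome_covariates` calls boot.ci internally is not shown here.

```r
library(boot)

set.seed(1)
x <- rexp(50)  # toy data, unrelated to any pangenome

# Statistic in the form boot() expects: data plus resampled indices
mean_stat <- function(data, idx) mean(data[idx])

# 100 bootstrap replicates, analogous to nboot = 100
b <- boot(x, statistic = mean_stat, R = 100)

# Normal-approximation and percentile intervals at the 95% level
ci_norm <- boot.ci(b, conf = 0.95, type = "norm")
ci_perc <- boot.ci(b, conf = 0.95, type = "perc")

ci_norm$normal   # columns: conf, lower, upper
ci_perc$percent  # columns: conf, two order-statistic indices, lower, upper
```

With a small number of replicates the two interval types can differ noticeably, so it may be worth increasing `nboot` when the intervals matter.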

Value

a list containing a summary of the comparison and the resulting 'glm' model object

Examples


library(panstripe)

# Simulate three pangenomes with different gene gain/loss rates
simA <- simulate_pan(rate=1e-4, ngenomes = 50, fn_error_rate=1, fp_error_rate=1)
simB <- simulate_pan(rate=1e-3, ngenomes = 200, fn_error_rate=1, fp_error_rate=1)
simC <- simulate_pan(rate=5e-3, ngenomes = 100, fn_error_rate=1, fp_error_rate=1)

# Fit a panstripe model to each simulated pangenome
tfits <- purrr::map(list(A=simA, B=simB, C=simC), ~{
  panstripe(.x$pa, .x$tree, nboot=10, ci_type='perc')
})

# The first column holds the pangenome names; 'dummy' is the covariate to test
covariates <- tibble::tibble(
  pangenome=c('A','B','C','E','F','G'),
  dummy=c(1,2,3,1,2,2)
)
# Reuse the three fits under new names so that six pangenomes are compared
fits <- c(tfits, list(E=tfits[[1]], F=tfits[[2]], G=tfits[[3]]))
comp <- compare_pangenome_covariates(fits, covariates, modeldisp=TRUE)
comp$summary
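
Because the first column of `covariates` has to match the names of the `fits` list, it can be useful to check the two against each other before running the comparison. The sketch below uses base R only. The objects `fits_demo` and `cov_demo` are hypothetical stand-ins (the list elements are placeholders, not real 'panfit' objects), and this check is a suggested precaution rather than something the package is documented to require.

```r
# Stand-in for a named list of 'panfit' objects; only the names matter here
fits_demo <- list(A = 1, B = 2, C = 3)

# Covariate table with pangenome names in the first column
cov_demo <- data.frame(
  pangenome = c("A", "B", "C"),
  dummy = c(1, 2, 3)
)

# Names present in one object but missing from the other
missing_in_covariates <- setdiff(names(fits_demo), cov_demo[[1]])
missing_in_fits <- setdiff(cov_demo[[1]], names(fits_demo))

# Stop early with a readable message if the two do not line up
if (length(missing_in_covariates) > 0 || length(missing_in_fits) > 0) {
  stop("Names in the fits list and the first covariate column do not match")
}
```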

gtonkinhill/panstripe documentation built on Feb. 27, 2025, 9:01 p.m.