qfa.epi: Finds genetic interaction strengths and p-values

View source: R/EpiStasisFunctions.R

qfa.epi    R Documentation

Description

This function is from the original QFA v1 release and was not tested by the authors of the bacterial adaptation re-release (v2). Use with caution. Fits a genetic independence model between control strains and double mutant strains, using either rjags with a Bayesian linear regression model or lm with maximum likelihood. For each ORF, the function calculates the probability that it is a false discovery of a suppressor or enhancer. These probabilities are then FDR-corrected and returned along with genetic interaction scores.
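The idea of a genetic independence model followed by FDR correction can be sketched in base R. This is only an illustration under stated assumptions, not the code inside qfa.epi: the fitness values and p-values are made up, the regression is forced through the origin to represent a multiplicative model, and the interaction score is taken as observed minus expected fitness without any normalisation.

```r
# Hypothetical per-ORF mean fitnesses (made-up numbers, for illustration only)
control_fit <- c(ORF1 = 100, ORF2 = 80, ORF3 = 60, ORF4 = 40)
double_fit  <- c(ORF1 = 50,  ORF2 = 40, ORF3 = 45, ORF4 = 10)

# ASSUMPTION: under genetic independence, double mutant fitness is
# proportional to control fitness, so fit a line through the origin.
model <- lm(double_fit ~ 0 + control_fit)
slope <- unname(coef(model))

# Interaction score: observed minus expected double mutant fitness
expected <- slope * control_fit
gis <- double_fit - expected

# Hypothetical raw p-values, FDR-corrected (Benjamini-Hochberg)
p <- c(ORF1 = 0.20, ORF2 = 0.50, ORF3 = 0.001, ORF4 = 0.01)
q <- p.adjust(p, method = "fdr")

print(data.frame(GIS = round(gis, 2), P = p, Q = q))
```

A positive score here means the double mutant is fitter than the independence model predicts, and a negative score means it is less fit.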

Usage

qfa.epi(
  double,
  control,
  fdef = "fit",
  qthresh = 0.05,
  orfdict = "ORF2GENE.txt",
  GISthresh = 0,
  plot = TRUE,
  modcheck = FALSE,
  wctest = TRUE,
  bootstrap = NULL,
  Nboot = 5000,
  subSamp = Inf,
  reg = "lmreg"
)

Arguments

double

Either a qfa.posterior or the results of qfa.fit for the double mutants

control

Either a qfa.posterior or the results of qfa.fit for the control strains

fdef

String specifying what fitness definition to use. Must be the name of a column common to double and control. Typical options include: "nAUC", "r", "MDRMDP". The default "fit" is included for backwards compatibility with earlier versions of this function which relied on users manually creating a "fit" column that includes their required fitness definition values. This was usually achieved by copying an existing column (e.g. "MDRMDP").
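The legacy "fit" convention described above amounts to copying a column. The sketch below uses two small hypothetical data frames as stand-ins for qfa.fit output (real output has many more columns) and also checks that a chosen fitness definition is present in both.

```r
# Hypothetical stand-ins for qfa.fit results; values are made up
control <- data.frame(ORF = c("YAL001C", "YAL002W"), MDRMDP = c(12.5, 8.3))
double  <- data.frame(ORF = c("YAL001C", "YAL002W"), MDRMDP = c(6.1, 7.9))

# Legacy approach: copy the required fitness definition into a "fit" column
control$fit <- control$MDRMDP
double$fit  <- double$MDRMDP

# Alternative: name the column directly, after checking it is common to both
fdef <- "MDRMDP"
common_cols <- intersect(colnames(control), colnames(double))
stopifnot(fdef %in% common_cols)
```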

qthresh

Cutoff applied to the FDR-corrected p-values (q-values).

orfdict

Location of a file with ORFs in the first column and the corresponding gene names in the second, so that gene names can be plotted
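A two-column ORF-to-gene file of this kind can be read into a lookup vector as follows. The tab delimiter and the absence of a header row are assumptions made for this sketch; the file is written to a temporary path so the example is self-contained.

```r
# Write a small hypothetical dictionary file (ASSUMPTION: tab-delimited, no header)
dict_path <- tempfile(fileext = ".txt")
writeLines(c("YAL001C\tTFC3", "YAL002W\tVPS8"), dict_path)

# Read ORFs (first column) and gene names (second column)
orf2gene <- read.delim(dict_path, header = FALSE,
                       col.names = c("ORF", "Gene"),
                       stringsAsFactors = FALSE)

# Named vector: look up a gene name by ORF
gene_lookup <- setNames(orf2gene$Gene, orf2gene$ORF)
print(gene_lookup[["YAL002W"]])
```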

GISthresh

When returning interaction hitlists, this variable determines the cutoff for strength of genetic interaction.

plot

If TRUE, then a 2-way fitness plot is made.

modcheck

If TRUE, then diagnostic residual plots are written to “ModelCheck.pdf”.

wctest

If TRUE, then use the Wilcoxon test for differences in medians as a measure of statistical significance of genetic interaction. This is the default. If FALSE, then use a t-test for difference in mean fitnesses instead.
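The choice between the two tests can be illustrated with base R on made-up replicate fitnesses. This sketch compares two sets of replicates directly; how qfa.epi prepares the values before testing is not described here, so treat the comparison as an assumption.

```r
# Hypothetical replicate fitness observations for one ORF (made-up values)
control_reps <- c(10.1, 9.8, 10.4, 10.0, 9.9, 10.3)
query_reps   <- c(7.9, 8.4, 8.1, 8.6, 8.0, 8.3)

wctest <- TRUE
if (wctest) {
  # Wilcoxon rank-sum test, paired with the median as summary statistic
  test <- wilcox.test(query_reps, control_reps)
  summary_fun <- median
} else {
  # Welch t-test, paired with the mean as summary statistic
  test <- t.test(query_reps, control_reps)
  summary_fun <- mean
}

p_value <- test$p.value
query_summary <- summary_fun(query_reps)
control_summary <- summary_fun(control_reps)
```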

bootstrap

If TRUE, then use a bootstrapping procedure to check whether genetic interactions are significant. If FALSE, then use linear regression and a t-test or Wilcoxon test.

Nboot

Number of bootstrap samples to generate if using bootstrapping procedure

subSamp

Number of replicates to sample when bootstrapping. Each summary (each bootstrap sample) is based on sampling subSamp observations from the N available replicates with replacement. The default, Inf, uses all available replicates: if subSamp==Inf, then subSamp is set equal to N.
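The resampling scheme described for Nboot and subSamp can be sketched in base R. The replicate values are made up, and the use of the median as the per-sample summary is an assumption for illustration.

```r
set.seed(1)  # for reproducibility of this sketch

# Hypothetical replicate fitnesses for one strain (made-up values)
reps <- c(8.1, 7.9, 8.4, 8.0, 8.6, 8.3)
N <- length(reps)

Nboot <- 5000
subSamp <- Inf
# As documented: an infinite subSamp means "use all N replicates"
if (is.infinite(subSamp)) subSamp <- N

# Each bootstrap summary draws subSamp values from the N replicates
# with replacement, then summarises them (ASSUMPTION: median)
boot_summaries <- replicate(
  Nboot,
  median(sample(reps, size = subSamp, replace = TRUE))
)

# Spread of the bootstrap summaries
print(quantile(boot_summaries, c(0.025, 0.975)))
```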

reg

String specifying what type of regression to use. Default is least squares regression as implemented in the lm function: "lmreg". Alternatives include "quantreg", "splitreg" and "perpreg". See the lm.epi function help for further details.

Value

Returns an R list containing three data frames: Results, Enhancers and Suppressors. Each data frame has the following columns:

  • ORF - Unique strain genotype identifier (e.g. Y-number for yeast strains)

  • Gene - Human readable genotype identifier

  • P - p-value for significance of difference between control and query strain fitnesses

  • Q - q-value for significance of difference between control and query strain fitnesses. This is the FDR-corrected p-value

  • GIS - Genetic interaction strength. Deviation of observed query strain fitness (mean or median, depending on value of wctest) from the fitness expected given control strain fitness and a multiplicative model of genetic interaction.

  • QueryFitnessSummary - Summary statistic for all available replicate observations of query strain fitness (mean or median, depending on value of wctest).

  • ControlFitnessSummary - Summary statistic for all available replicate observations of control strain fitness (mean or median, depending on value of wctest).

  • QuerySE - Standard error on mean of query strain fitness observations

  • ControlSE - Standard error on mean of control strain fitness observations

  • TestType - Type of statistical test for significant difference carried out (i.e. Wilcoxon or t-test)

  • SummaryType - Type of summary statistic used for fitnesses (i.e. mean or median)

  • cTreat - Treatment applied to control plates

  • cMed - Medium added to agar in control plates

  • cBack - Control plate background tag (experiment identifier)

  • qTreat - Treatment applied to query plates

  • qMed - Medium added to agar in query plates

  • qBack - Query plate background tag (experiment identifier)

  • Type - Type of genetic interaction observed (suppressor, enhancer, positive, negative). This is assigned for strains with abs(GIS)>GISthresh and by comparing q-value with qthresh.
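The way qthresh and GISthresh combine to produce hitlists can be sketched on a small made-up results table. The sign convention (positive GIS for suppressors, negative for enhancers) and the strict inequality on the q-value are assumptions for this illustration; check the package source for the exact rules.

```r
# Hypothetical results table with only the columns needed here
results <- data.frame(
  ORF = c("ORF_A", "ORF_B", "ORF_C", "ORF_D"),
  GIS = c(0.30, -0.25, 0.02, -0.40),
  Q   = c(0.01, 0.03, 0.60, 0.20),
  stringsAsFactors = FALSE
)

qthresh   <- 0.05
GISthresh <- 0.1

# A hit needs both a small q-value and a large enough interaction strength
significant <- results$Q < qthresh & abs(results$GIS) > GISthresh

# ASSUMPTION: positive GIS = suppressor, negative GIS = enhancer
suppressors <- results[significant & results$GIS > 0, ]
enhancers   <- results[significant & results$GIS < 0, ]
```

With these made-up values, ORF_D is excluded despite its large GIS because its q-value exceeds qthresh.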


JulBaer/baQFA documentation built on Feb. 19, 2023, 10:32 p.m.