analyzeLegacyTileseqCounts: analyze tileseq counts from legacy pipeline

View source: R/legacy.R

analyzeLegacyTileseqCounts    R Documentation

analyze tileseq counts from legacy pipeline

Description

This analysis function performs the following steps for each mutagenesis region:

1. Construction of HGVS variant descriptor strings.
2. Collapsing equivalent codons into amino acid change counts.
3. Error regularization at the level of pre- and post-selection counts.
4. Quality-based filtering based on "Song's rule".
5. Fitness score calculation and error propagation.
6. Secondary error regularization at the level of fitness scores.
7. Determination of synonymous and nonsense medians and re-scaling of fitness scores.
8. Flooring of negative scores and adjustment of associated error.
9. Output in MaveDB format.
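To make step 7 concrete, the following is a minimal sketch of median-based re-scaling, not the package's actual code. It assumes the common convention that the nonsense median maps to 0 and the synonymous median maps to 1; the data frame, its column names (`type`, `logphi`, `sd`) and all values are hypothetical. The error is simply divided by the same scale factor, which ignores any uncertainty in the medians themselves.

```r
# Hypothetical pre-scaling scores for one region (all values invented)
scores <- data.frame(
  hgvs   = c("p.Ala2=", "p.Leu3=", "p.Ala2Ter", "p.Leu3Ter", "p.Ala2Val"),
  type   = c("synonymous", "synonymous", "nonsense", "nonsense", "missense"),
  logphi = c(0.1, -0.1, -2.1, -1.9, -1.0),
  sd     = c(0.2, 0.2, 0.3, 0.3, 0.25)
)

# Medians of the two reference classes
synMedian  <- median(scores$logphi[scores$type == "synonymous"])
stopMedian <- median(scores$logphi[scores$type == "nonsense"])

# Assumed convention: nonsense median -> 0, synonymous median -> 1
span <- synMedian - stopMedian
scores$score    <- (scores$logphi - stopMedian) / span
scores$score.sd <- scores$sd / abs(span)

print(scores[, c("hgvs", "type", "score", "score.sd")])
```

Under this convention a missense variant that sits halfway between the two medians receives a score of 0.5.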

Usage

analyzeLegacyTileseqCounts(
  countfile,
  regionfile,
  outdir,
  logger = NULL,
  inverseAssay = FALSE,
  pseudoObservations = 2,
  conservativeMode = TRUE
)

Arguments

countfile

the path to the "rawData.txt" file produced by the legacy pipeline.

regionfile

the path to a tab-delimited file describing the mutagenesis regions. Must contain columns 'region', 'start', 'end', 'syn', 'stop', i.e. the region id, the start position, the end position, and optional synonymous and stop mean overrides.
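As an illustration, a region file with the required columns could be written from R as sketched below. The region boundaries and override values are invented, and representing an absent override as NA is an assumption; the documentation above does not state how missing overrides should be encoded.

```r
# Hypothetical region table: two regions, overrides given only for region 2
regions <- data.frame(
  region = c(1L, 2L),
  start  = c(1L, 91L),
  end    = c(90L, 180L),
  syn    = c(NA, 0.05),   # assumed: NA means "no override"
  stop   = c(NA, -1.8)
)

# Write as a tab-delimited file with a header row
regionfile <- file.path(tempdir(), "regions.txt")
write.table(regions, regionfile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout
regionsCheck <- read.delim(regionfile)
print(regionsCheck)
```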

outdir

path to desired output directory

logger

a yogilogger object to be used for logging (or NULL for simple printing)

inverseAssay

a boolean flag indicating that the experiment was done with an inverse assay, i.e. one in which protein function leads to decreased fitness. Defaults to FALSE.

pseudoObservations

The number of pseudo-observations to use for the Baldi & Long regularization. Defaults to 2.
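For orientation, the regularized standard deviation described by Baldi & Long combines a prior estimate with the empirical one, weighted by the number of pseudo-observations. The sketch below implements that published formula; the function name `baldiLongSD` is hypothetical, and how the package derives the prior estimate and applies the formula internally is not documented here.

```r
# Baldi & Long regularized standard deviation:
#   sigma^2 = (v0 * sigma0^2 + (n - 1) * s^2) / (v0 + n - 2)
# v0 = pseudo-observations, sigma0 = prior SD, s = empirical SD, n = replicates.
# Requires v0 + n > 2.
baldiLongSD <- function(priorSD, empiricalSD, n, pseudoObservations = 2) {
  v0 <- pseudoObservations
  sqrt((v0 * priorSD^2 + (n - 1) * empiricalSD^2) / (v0 + n - 2))
}

# With few replicates, the estimate is pulled towards the prior
regularized <- baldiLongSD(priorSD = 0.3, empiricalSD = 0.1, n = 3)
print(regularized)
```

Raising `pseudoObservations` gives the prior more weight, so noisy empirical estimates from a small number of replicates have less influence.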

conservativeMode

Boolean flag. When turned on, pseudo-observations are not counted towards the standard error and the first round of regularization uses pessimistic error estimates. Defaults to TRUE.

Value

Nothing. Output is written to various files in the output directory.
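A call might look like the sketch below. All paths are placeholders, and the call is guarded so that it only runs when the tileseqMave package is installed and the input files exist. Creating the output directory beforehand is a precaution; whether the function creates it itself is not stated in this documentation.

```r
# Placeholder paths (hypothetical); replace with real locations
countfile  <- "legacy/rawData.txt"
regionfile <- "legacy/regions.txt"
outdir     <- "legacy/output"

# Arguments spelled out with their documented defaults
args <- list(
  countfile = countfile,
  regionfile = regionfile,
  outdir = outdir,
  inverseAssay = FALSE,
  pseudoObservations = 2,
  conservativeMode = TRUE
)

# Only run when the package and the input files are actually available
if (requireNamespace("tileseqMave", quietly = TRUE) &&
    file.exists(countfile) && file.exists(regionfile)) {
  dir.create(outdir, recursive = TRUE, showWarnings = FALSE)
  do.call(tileseqMave::analyzeLegacyTileseqCounts, args)
}
```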


jweile/tileseqMave documentation built on April 5, 2024, 4:51 p.m.