estimate_lineages: Estimate Lineage Proportions In Samples
In MixviR: Analysis and Exploration of Mixed Microbial Genomic Samples

Description

Create summary tables containing data on lineages identified in samples, including estimates of relative proportions of lineages and identities of associated characteristic mutations.

Usage

estimate_lineages(
  muts.df,
  min.alt.freq = 0.01,
  dates = NULL,
  lineage.muts = NULL,
  read.muts.from = NULL,
  scale = TRUE,
  use.median = FALSE,
  outfile.name = NULL,
  presence.thresh = 0.5,
  samps.to.inc = NULL,
  locs.to.inc = NULL,
  lineages.to.inc = NULL,
  report.all = FALSE,
  depths.from = "all"
)

Arguments

`muts.df`	A data frame (produced by `call_mutations()`) storing mutation information for the samples to analyze. Must contain columns SAMP_NAME, CHR, POS, GENE, ALT_ID, AF, and DP. Alternatively, the mutation data can be read in from a (comma-separated) file with the `read.muts.from` argument. See the write.mut.table argument in `call_mutations()`.
`min.alt.freq`	Minimum frequency (0-1) for mutation to be counted. Default = 0.01.
`dates`	Path to an optional csv file with columns "SAMP_NAME", "LOCATION", and "DATE". Sample names need to match those in the samp_mutations data frame created by `call_mutations()`. Dates should be provided in the format mmddyyyy (e.g., 01052022 for January 5, 2022).
`lineage.muts`	(Required) Path to a csv file with columns "Gene", "Mutation", and "Lineage" defining mutations associated with lineages of interest. See the example file at "https://github.com/mikesovic/MixviR/blob/main/mutation_files/outbreak_20220217.csv". Additional columns will be ignored.
`read.muts.from`	An alternative to muts.df for providing input. If a data frame generated by `call_mutations()` was previously written to a (comma-separated) file (see write.mut.table in `call_mutations()`), the mutation data can be read in from that file by providing its path.
`scale`	Logical to indicate whether estimated proportions of lineages within a sample should be scaled down to sum to 1 if the sum of the initial estimates is > 1. Default = TRUE.
`use.median`	Logical defining the metric used to estimate frequencies of lineages in samples. If TRUE, the median is used. Default = FALSE (mean is used).
`outfile.name`	If writing output to file, a character string giving the name/path of the file (csv) to be written.
`presence.thresh`	Numeric (0-1) defining a proportion of characteristic mutations that must be present in the sample for a lineage to be considered present. This threshold is applied if report.all = FALSE (the default).
`samps.to.inc`	Character vector of one or more sample names to include. If NULL (default), all samples are included.
`locs.to.inc`	Character vector of one or more locations to include. If NULL (default), all locations are included. Applies only if a dates file is provided, and these locations must match those in the 'LOCATION' column of that file.
`lineages.to.inc`	Character vector of one or more lineages to test for and report in results. If NULL (default), all lineages listed in the lineage.muts file are evaluated and reported.
`report.all`	Logical indicating whether to report results for all lineages (TRUE), or just those with a proportion of mutations present that exceeds presence.thresh. Default = FALSE.
`depths.from`	Character, one of "all" (default) or "characteristic". If "all", average sequencing depths are calculated based on all mutations in a sample. If "characteristic", mean depths are calculated from the set of mutations that occur in only one analyzed lineage (mutations shared by two or more lineages are filtered out prior to calculating depths).
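
The way `min.alt.freq`, `presence.thresh`, `use.median`, and `scale` are described above can be illustrated with a small base R sketch. This is not MixviR's internal code: the `obs` data frame, the lineage names, and the choice to average only the observed mutations are assumptions made for illustration, as is the strict comparison against the threshold.

```r
# Hypothetical per-mutation frequencies for one sample (not MixviR output).
obs <- data.frame(
  Lineage = rep(c("A", "B", "C"), each = 4),
  AF = c(0.70, 0.80, 0.60, 0.00,   # lineage A: 3 of 4 mutations observed
         0.50, 0.40, 0.60, 0.50,   # lineage B: 4 of 4 mutations observed
         0.02, 0.00, 0.00, 0.00)   # lineage C: 1 of 4 mutations observed
)

min.alt.freq    <- 0.01
presence.thresh <- 0.5
use.median      <- FALSE
scale           <- TRUE

# A mutation counts as observed when its frequency reaches min.alt.freq.
obs$present <- obs$AF >= min.alt.freq

# Proportion of each lineage's characteristic mutations observed in the sample.
prop.present <- tapply(obs$present, obs$Lineage, mean)

# Keep lineages whose proportion exceeds the threshold (assumed strict).
kept <- names(prop.present)[prop.present > presence.thresh]

# Summarise frequencies of the observed mutations per retained lineage
# (assumption: only observed mutations contribute to the estimate).
seen <- obs[obs$present & obs$Lineage %in% kept, ]
est  <- tapply(seen$AF, seen$Lineage, if (use.median) median else mean)

# Scale down only when the initial estimates sum to more than 1.
scaled <- if (scale && sum(est) > 1) est / sum(est) else est

print(kept)
print(round(scaled, 3))
```

In this toy data, lineage C falls below the threshold and is dropped, and the remaining estimates are divided by their sum because they initially add up to more than 1.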

Value

Data frame containing estimates of proportions of each lineage in the sample.

Examples

estimate_lineages(lineage.muts = system.file("extdata", 
                                             "example_lineage_muts.csv", 
                                             package = "MixviR"), 
                  read.muts.from = system.file("extdata", 
                                               "sample_mutations.csv", 
                                               package = "MixviR"))
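
When supplying custom inputs, the `lineage.muts` and `dates` files need the column names listed under Arguments. The following base R sketch writes two such files to temporary paths and checks that the dates parse as mmddyyyy. All sample names, locations, lineage names, and mutation labels here are hypothetical placeholders; consult the example file linked under `lineage.muts` for the mutation notation MixviR expects.

```r
# Hypothetical lineage definitions with the three required columns.
lineage.defs <- data.frame(
  Gene     = c("S", "S", "N"),
  Mutation = c("N501Y", "L452R", "R203M"),   # placeholder labels
  Lineage  = c("LineageA", "LineageB", "LineageB")
)
lineage.file <- tempfile(fileext = ".csv")
write.csv(lineage.defs, lineage.file, row.names = FALSE)

# Hypothetical sample metadata. Dates are kept as text in mmddyyyy form
# so that leading zeros are not lost.
sample.dates <- data.frame(
  SAMP_NAME = c("Sample1", "Sample2"),
  LOCATION  = c("SiteA", "SiteA"),
  DATE      = c("01052022", "02172022")
)
dates.file <- tempfile(fileext = ".csv")
write.csv(sample.dates, dates.file, row.names = FALSE)

# Sanity check: read the dates back as character and parse them.
check  <- read.csv(dates.file, colClasses = "character")
parsed <- as.Date(check$DATE, format = "%m%d%Y")
print(parsed)
```

The resulting paths could then be passed as `lineage.muts = lineage.file` and `dates = dates.file`, provided the SAMP_NAME values match the sample names in the mutation data being analyzed.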

MixviR documentation built on Oct. 23, 2022, 1:09 a.m.