compute_collection_table: Get collection table normalized in wide format
In Miswi/RiboCrypt: Interactive visualization in genomics

compute_collection_table

R Documentation

Get collection table normalized in wide format

Description

Get collection table normalized in wide format

Usage

compute_collection_table(
  path,
  lib_sizes,
  df,
  metadata_field,
  normalization,
  kmer,
  metadata,
  min_count = 0,
  format = "wide",
  value.var = "logscore",
  as_list = FALSE,
  subset = NULL,
  group_on_tx_tpm = NULL,
  split_by_frame = FALSE,
  ratio_interval = NULL,
  decreasing_order = FALSE
)

Arguments

`path`	the path to gene counts
`lib_sizes`	named integer vector, default NULL. If given, a pre-TPM normalization by full library sizes is applied.
`df`	the ORFik experiment to load the precomputed collection from. It must also have defined runIDs() for all samples.
`metadata_field`	the name of the column in metadata to group on.
`normalization`	a character string giving the normalization mode. For options, see RiboCrypt:::normalizations
`kmer`	integer, default 1L (off). If > 1, the signal is smoothed with a sliding window of size kmer.
`metadata`	a data.table of metadata, must contain the Run column to select libraries.
`min_count`	integer, default 0. Minimum counts of coverage over transcript to be included.
`format`	character, default "wide", alternative "long". The format of the table output.
`value.var`	which column to use as scores, default "logscore"
`as_list`	logical, default FALSE. If TRUE, return a list of size 2: the count data.table and the metadata data.table. Set to TRUE if you need the metadata subset (needed if you subset the table, to get correct matching).
`subset`	numeric vector, positional interval to subset, must be <= size of whole region.
`group_on_tx_tpm`	numeric vector, default NULL. TPM values per library, either for that gene or some other gene.
`split_by_frame`	logical, default FALSE. For the kmer sliding window, should the signal be split by frame?
`ratio_interval`	numeric vector of size 2 or 4, default NULL. If size 2, libraries are sorted on coverage in that region. If size 4, libraries are sorted on the ratio of that region in this gene versus the other region in another gene.
`decreasing_order`	logical, default FALSE. Sort your ordering vector from lowest upwards (default). If TRUE, sort from highest downwards.
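
To make the effect of `kmer` concrete, here is a minimal base R sketch of sliding-window smoothing. It is not the package's implementation: the helper `smooth_kmer` and the count values are hypothetical, and a centered moving average via `stats::filter` is assumed as the window rule.

```r
# Hypothetical per-position read counts for one library (illustrative values).
counts <- c(0, 3, 0, 0, 6, 0, 0, 9, 0)

# Hypothetical helper: sliding-window mean over positions.
# kmer = 1 means "off", so the signal is returned unchanged.
smooth_kmer <- function(x, kmer = 1L) {
  if (kmer <= 1L) return(x)
  # Centered moving average (an assumption about the window rule);
  # positions without a full window become NA.
  as.numeric(stats::filter(x, rep(1 / kmer, kmer), sides = 2))
}

smoothed <- smooth_kmer(counts, kmer = 3L)
print(smoothed)
```

With a window of 3, the three-nucleotide periodicity in this toy signal is averaged out, which is the kind of smoothing the `kmer` argument describes.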
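
The difference between `format = "long"` and `format = "wide"` can be sketched with a toy table. This uses base R `reshape` only for illustration (the package itself works with data.table); the run IDs, the scores, and the choice of one column per library in the wide layout are assumptions, not the function's documented output.

```r
# Hypothetical long-format table: one row per (library, position) pair.
long <- data.frame(
  library  = rep(c("SRR1", "SRR2"), each = 3),
  position = rep(1:3, times = 2),
  logscore = c(0.0, 1.2, 0.4, 0.7, 0.1, 2.0)
)

# Wide format (assumed layout): one row per position, one score column per library.
wide <- reshape(long, idvar = "position", timevar = "library",
                direction = "wide")
print(wide)
```

Here `logscore` plays the role of `value.var`: it is the column whose values fill the cells of the wide table.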
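
The size-2 case of `ratio_interval`, combined with `decreasing_order`, can be illustrated as follows. This is a sketch under stated assumptions: the coverage matrix and library names are hypothetical, and summing raw coverage over the interval is assumed as the sorting score, which may differ from what the function computes internally.

```r
# Hypothetical coverage matrix: rows are positions, columns are libraries.
cov <- cbind(
  libA = c(1, 5, 5, 0, 2),
  libB = c(4, 0, 1, 1, 9),
  libC = c(0, 2, 2, 2, 0)
)

# Interval of size 2: start and end position of the region to sort on.
ratio_interval <- c(2, 4)
idx <- seq(ratio_interval[1], ratio_interval[2])

# Assumed score: total coverage per library inside the interval.
region_cov <- colSums(cov[idx, , drop = FALSE])

# decreasing = FALSE mirrors the default decreasing_order = FALSE (lowest first).
ord <- order(region_cov, decreasing = FALSE)
cov_sorted <- cov[, ord, drop = FALSE]
print(colnames(cov_sorted))
```

Passing `decreasing = TRUE` to `order` reverses the library order, matching the description of `decreasing_order = TRUE`.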