fish4rodents: Import abundances directly from kallisto output.

View source: R/input.R

fish4rodents    R Documentation

Description

Apply TPM normalisation using the info available from the abundance.h5 files.

Usage

fish4rodents(
  A_paths,
  B_paths,
  annot,
  TARGET_COL = "target_id",
  PARENT_COL = "parent_id",
  beartext = FALSE,
  threads = 1L,
  scaleto = 1e+06
)

Arguments

A_paths

(character) A vector of strings, listing the directory paths to the quantifications for the first condition. One directory per replicate, without trailing path dividers. The directory name should be a unique identifier for the sample.

B_paths

(character) A vector of strings, listing the directory paths to the quantifications for the second condition. One directory per replicate, without trailing path dividers. The directory name should be a unique identifier for the sample.
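The path requirements above can be checked before import. The following is a minimal sketch in base R; the directory names are hypothetical placeholders, and the checks illustrate the stated constraints rather than reproduce anything the package runs internally.

```r
# Hypothetical kallisto output directories, one per replicate.
A_paths <- c("quant/ctrl_1", "quant/ctrl_2", "quant/ctrl_3")
B_paths <- c("quant/treat_1", "quant/treat_2", "quant/treat_3")

all_paths <- c(A_paths, B_paths)

# No path may end in a path divider (forward slash or backslash).
no_trailing <- !grepl("[/\\\\]$", all_paths)

# The final directory name doubles as the sample identifier, so it must be unique.
sample_ids <- basename(all_paths)
ids_unique <- anyDuplicated(sample_ids) == 0

paths_ok <- all(no_trailing) && ids_unique
```

If `paths_ok` is `FALSE`, renaming the directories or stripping the trailing divider before calling fish4rodents() avoids ambiguous sample identifiers.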

annot

(data.frame) A table matching transcript identifiers to gene identifiers. This should be the same annotation that you used for quantification and that you will use with call_DTU(). It is used to order the transcripts consistently throughout RATs.

TARGET_COL

The name of the column for the transcript identifiers in annot. (Default "target_id")

PARENT_COL

The name of the column for the gene identifiers in annot. (Default "parent_id")
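As an illustration of how annot, TARGET_COL and PARENT_COL relate, here is a minimal sketch with made-up identifiers. The first table uses the default column names; the second uses hypothetical custom names (`transcript`, `gene`), which would then have to be passed via TARGET_COL and PARENT_COL.

```r
# Annotation using the default column names: nothing extra to specify.
annot <- data.frame(
  target_id = c("geneA.1", "geneA.2", "geneB.1"),  # transcript identifiers
  parent_id = c("geneA", "geneA", "geneB"),        # gene identifiers
  stringsAsFactors = FALSE
)

# Annotation with custom (hypothetical) column names.
annot_custom <- data.frame(
  transcript = c("geneA.1", "geneA.2", "geneB.1"),
  gene = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# With the custom table, the column names are passed explicitly.
custom_args <- list(annot = annot_custom,
                    TARGET_COL = "transcript",
                    PARENT_COL = "gene")
```

In both cases each transcript identifier appears once and maps to exactly one gene identifier.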

beartext

(logical) Instead of importing bootstrap data from the abundance.h5 file of each sample, import it from plaintext files in a bootstrap subdirectory created by running kallisto's h5dump subcommand (Default FALSE). This workaround circumvents some mysterious .h5 parsing issues on certain systems.

threads

(integer) Number of threads to use for parallel processing. (Default 1)

scaleto

(double) Scaling factor for normalised abundances (Default 1000000, which gives TPM). If a numeric vector is supplied instead, its length must match the total number of samples. The value order should correspond to the samples in group A followed by group B. This allows each sample to be scaled to its own actual library size, so that higher-throughput samples carry more weight in deciding DTU.
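To make the effect of scaleto concrete, the sketch below normalises toy abundances so that each sample sums to its own scaling factor. It assumes the usual TPM-style formula (counts divided by effective length, then rescaled); `scale_abundance` and all values are hypothetical and are not taken from the package source.

```r
# Hypothetical helper: TPM-style normalisation to an arbitrary total.
scale_abundance <- function(est_counts, eff_length, scaleto = 1e6) {
  rate <- est_counts / eff_length   # length-corrected abundance
  scaleto * rate / sum(rate)        # rescale so the sample sums to `scaleto`
}

# Toy data: three transcripts, one sample from each condition.
eff_length <- c(1000, 1500, 3000)
counts <- list(sampleA1 = c(100, 300, 600),
               sampleB1 = c(50, 150, 300))

# Default scaling gives TPM: every sample sums to one million.
tpm <- lapply(counts, scale_abundance, eff_length = eff_length)

# Per-sample scaling: one value per sample, group A first, then group B.
scaleto <- c(2.5e7, 1e7)
scaled <- Map(function(cnt, s) scale_abundance(cnt, eff_length, s),
              counts, scaleto)

totals <- vapply(scaled, sum, numeric(1))
```

Both toy samples have identical transcript proportions, so only their totals differ after per-sample scaling; this is the sense in which a deeper library carries more weight.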

Details

Converting, normalising and importing multiple bootstrapped abundance files can take some time. IMPORTANT: This function is currently not intended to be used to import non-bootstrapped quantifications.

Value

A list of two elements, representing the TPM abundances per condition. These will be formatted in the RATs generic data input format, preferably for bootstrapped estimates (if bootstraps are available) or otherwise for plain count estimates.
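A call might look like the following sketch. The paths and annotation are hypothetical, and the call is guarded so that it only runs when the rats package is installed and the directories actually exist; otherwise the block does nothing beyond defining its inputs.

```r
# Hypothetical inputs (replace with real kallisto output directories).
A_paths <- c("quant/ctrl_1", "quant/ctrl_2")
B_paths <- c("quant/treat_1", "quant/treat_2")
annot <- data.frame(
  target_id = c("geneA.1", "geneA.2", "geneB.1"),
  parent_id = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Only attempt the import when the package and the data are present.
can_run <- requireNamespace("rats", quietly = TRUE) &&
  all(dir.exists(c(A_paths, B_paths)))

if (can_run) {
  abund <- rats::fish4rodents(
    A_paths = A_paths,
    B_paths = B_paths,
    annot = annot,
    threads = 1L,
    scaleto = 1e6
  )
  # Inspect the top level of the returned two-element list.
  str(abund, max.level = 1)
}
```

The same `annot` table would then be supplied to call_DTU(), as noted in the argument description.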


bartongroup/RATS documentation built on June 8, 2022, 12:40 a.m.