CompDTUReg: Fit Compositional Regression Models for DTU

prepareData

R Documentation

Prepare raw abundance/count/length data for later analysis

Description

prepareData generates the abundance and count data used by later steps of the analysis, notably generateData.

Usage

prepareData(
  abundance,
  counts,
  lengths,
  tx2gene,
  nsamp,
  key = NULL,
  infReps = "none",
  samps = NULL
)

Arguments

`abundance`	is a data frame with nsamp+1 columns: one column per sample, named Sample1, Sample2, etc., plus a tx_id column (often taken from the row names). Rows are transcript-level quantification estimates. Column names should not include "TPM".
`counts`	is a data frame with nsamp+1 columns: one column per sample, named Sample1, Sample2, etc., plus a tx_id column (often taken from the row names). Rows are transcript-level quantification estimates. Column names should not include "Cnt".
`lengths`	is a data frame with nsamp+1 columns: one column per sample, named Sample1, Sample2, etc., plus a tx_id column (often taken from the row names). Rows are transcript-level effective length information. Column names should not include "Length".
`tx2gene`	is a dataframe that matches transcripts to genes. Can be created by `maketx2gene`.
`nsamp`	is the number of biological samples/replicates used in the analysis.
`key`	is a data.frame with columns "Sample" (the unique biological identifier for the analysis), "Condition" (the condition/treatment effect variable for the data), and "Identifier", whose values should be "Sample1", "Sample2", ..., up to the number of rows of key. "Identifier" must be created this way even if the observations don't correspond to unique biological samples.
`infReps`	is a character variable indicating what kind of inferential replicates (if any) are to be analyzed by the current function call. Must be one of "none", "Boot", or "Gibbs". Default is "none".
`samps`	is an optional vector containing the sample names. Specify this if the sample names are not exactly paste0("Sample", 1:nsamp), for example if some sample numbers are missing.
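The shapes described for these arguments can be sketched with toy data. This is a minimal illustration, not taken from the package: the transcript IDs, sample labels, conditions, and numeric values are all made up, and `make_mat` and `make_quant_df` are hypothetical helpers. The `tx2gene` argument is left out because its columns are not described here; the final call is shown only as a comment and assumes that CompDTUReg is loaded and a suitable `tx2gene` exists.

```r
# Toy inputs in the shape described in the Arguments section.
# All IDs and values below are invented for illustration only.
nsamp  <- 3
samps  <- paste0("Sample", 1:nsamp)       # the default naming scheme
tx_ids <- c("tx1", "tx2", "tx3", "tx4")   # hypothetical transcript IDs

# Hypothetical helper: a transcripts-by-samples matrix with dimnames set.
make_mat <- function(values) {
  matrix(values, nrow = length(tx_ids), dimnames = list(tx_ids, samps))
}

# Hypothetical helper: convert to a data frame and add tx_id from the row names.
make_quant_df <- function(mat) {
  df <- as.data.frame(mat)
  df$tx_id <- rownames(mat)
  df
}

set.seed(1)
abundance <- make_quant_df(make_mat(runif(length(tx_ids) * nsamp, 0, 100)))
counts    <- make_quant_df(make_mat(rpois(length(tx_ids) * nsamp, 50)))
lengths   <- make_quant_df(make_mat(rep(c(500, 1200, 800, 2000), nsamp)))

# key: one row per sample; Identifier follows the Sample1, Sample2, ... scheme.
key <- data.frame(
  Sample     = c("A", "B", "C"),          # hypothetical biological labels
  Condition  = c("ctrl", "ctrl", "trt"),  # hypothetical conditions
  Identifier = samps,
  stringsAsFactors = FALSE
)

# Sanity checks mirroring the documented requirements.
stopifnot(
  ncol(abundance) == nsamp + 1,
  ncol(counts)    == nsamp + 1,
  ncol(lengths)   == nsamp + 1,
  !any(grepl("TPM",    colnames(abundance))),
  !any(grepl("Cnt",    colnames(counts))),
  !any(grepl("Length", colnames(lengths))),
  identical(key$Identifier, paste0("Sample", seq_len(nrow(key))))
)

# With CompDTUReg loaded and a tx2gene data frame available, the call would
# then look like this (not run here):
# prepared <- prepareData(abundance, counts, lengths, tx2gene, nsamp,
#                         key = key, infReps = "none")
```

Running the checks before the call catches the most likely input mistakes (a missing tx_id column, leftover "TPM"/"Cnt"/"Length" suffixes, or a misnumbered Identifier) without needing the package itself.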

Value

A list of length 2: the first element is the abundance data (abGeneTempF) and the second is the count data (cntGeneTempF), both for use with generateData.
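As a small sketch of handling the return value: because the text above specifies the order of the two elements but not whether the list is named, positional indexing is the safer assumption. The `prepared` object below is a stand-in with placeholder contents, not real output from prepareData.

```r
# Stand-in for the result of prepareData(); the real elements hold the
# abundance and count data, replaced here by placeholder strings.
prepared <- list("abundance placeholder", "count placeholder")

# Check the documented shape, then unpack by position.
stopifnot(is.list(prepared), length(prepared) == 2)
abGeneTempF  <- prepared[[1]]   # first element: abundance data
cntGeneTempF <- prepared[[2]]   # second element: count data
```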

skvanburen/CompDTUReg documentation built on Jan. 23, 2025, 9:01 a.m.