fn_process | R Documentation |
fn_process creates the data structures necessary to analyze nucleotide recoding RNA-seq data with the MLE and Hybrid implementations in bakRFit. The input to fn_process must be an object of class bakRFnData.
fn_process(
obj,
totcut = 50,
totcut_all = 10,
Chase = FALSE,
FOI = c(),
concat = TRUE
)
obj | An object of class bakRFnData |
totcut | Numeric; any transcripts with fewer than this number of sequencing reads in any replicate of all experimental conditions are filtered out |
totcut_all | Numeric; any transcripts with fewer than this number of sequencing reads in any sample are filtered out |
Chase | Boolean; if TRUE, pulse-chase analysis strategy is implemented |
FOI | Features of interest; character vector containing names of features to analyze. If |
concat | Boolean; if TRUE, FOI is concatenated with output of reliableFeatures |
fn_process first filters out features with insufficient read coverage, as set by totcut and totcut_all. It then creates the necessary data structures for analysis with bakRFit and some of the visualization functions (namely plotMA).
The 1st step executed by fn_process is to find the names of features which are deemed "reliable". A reliable feature is one with sufficient read coverage in every single sample (i.e., > totcut_all reads in all samples) and sufficient read coverage in all replicates of at least one experimental condition (i.e., > totcut reads in all replicates for one or more experimental conditions). This is done with a call to reliableFeatures.
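The two coverage criteria above can be sketched in base R on a toy count matrix. This is only an illustration of the stated rule, not the code inside reliableFeatures: the matrix counts, the vector condition, and the feature and sample names are all hypothetical, and the strict ">" comparison follows the wording of this section (the argument descriptions could also be read as ">=").

```r
# Hypothetical read counts: rows are features, columns are samples
counts <- matrix(
  c(120, 130,  60,  70,
     20,  40,  60,  70,
     55,  60,   5,  90,
     15,  12,  11,  14),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("WT_1", "WT_2", "KO_1", "KO_2"))
)

# Hypothetical experimental condition of each sample (one entry per column)
condition <- c("WT", "WT", "KO", "KO")

totcut <- 50      # cutoff within the replicates of a condition
totcut_all <- 10  # cutoff applied to every sample

# Criterion 1: more than totcut_all reads in every sample
pass_every_sample <- apply(counts > totcut_all, 1, all)

# Criterion 2: more than totcut reads in all replicates of at least one condition
per_condition <- sapply(unique(condition), function(cond) {
  apply(counts[, condition == cond, drop = FALSE] > totcut, 1, all)
})
pass_any_condition <- apply(per_condition, 1, any)

# Features meeting both criteria
reliable <- rownames(counts)[pass_every_sample & pass_any_condition]
print(reliable)
```

In this toy data, geneC is dropped because one sample has only 5 reads, and geneD is dropped because no condition has all replicates above totcut.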
The 2nd step is to extract only the reliable features from the fns data frame in the bakRFnData object. During this process, a numerical ID is given to each reliable feature, with the numerical ID corresponding to its order when arranged using dplyr::arrange.
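A minimal base R sketch of this subsetting and ID assignment follows. It assumes a toy fns data frame with hypothetical values and uses a C-locale sort as a stand-in for the ordering produced by dplyr::arrange; the actual ordering inside fn_process may differ for names with mixed case or special characters.

```r
# Hypothetical fraction new table: one row per feature per sample
fns <- data.frame(
  sample = c("WT_1", "WT_1", "WT_1", "KO_1", "KO_1", "KO_1"),
  XF     = c("geneB", "geneC", "geneA", "geneB", "geneC", "geneA"),
  fn     = c(0.2, 0.5, 0.8, 0.3, 0.6, 0.7),
  stringsAsFactors = FALSE
)

# Assumed output of the reliability filter
reliable <- c("geneA", "geneB")

# Keep only rows belonging to reliable features
fns_reliable <- fns[fns$XF %in% reliable, ]

# Sort the unique feature names (C locale) and use the position as the ID
feature_levels <- sort(unique(fns_reliable$XF), method = "radix")
fns_reliable$Feature_ID <- match(fns_reliable$XF, feature_levels)

# Lookup table mapping numerical ID back to the original feature name
sdf <- data.frame(XF = feature_levels,
                  Feature_ID = seq_along(feature_levels))
print(fns_reliable)
print(sdf)
```

Keeping a separate lookup table like sdf makes it possible to translate results indexed by Feature_ID back to the original feature names.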
The 3rd step is to prepare data structures that can be passed to fast_analysis and TL_stan (usually accessed via the bakRFit helper function).
Returns a list of objects that can be passed to TL_stan and/or fast_analysis. Those objects are:
Stan_data; list that can be passed to TL_stan with Hybrid_Fit = TRUE. Consists of metadata as well as data that Stan will analyze. Data to be analyzed consists of equal-length vectors. The contents of Stan_data are:
  NE; Number of datapoints for 'Stan' to analyze (NE = Number of Elements)
  NF; Number of features in dataset
  TP; Numerical indicator of s4U feed (0 = no s4U feed, 1 = s4U fed)
  FE; Numerical indicator of feature
  num_mut; Number of U-to-C mutations observed in a particular set of reads
  MT; Numerical indicator of experimental condition (Exp_ID from metadf)
  nMT; Number of experimental conditions
  R; Numerical indicator of replicate
  nrep; Number of replicates (maximum across experimental conditions)
  nrep_vect; Vector of number of replicates in each experimental condition
  tl; Vector of label times for each experimental condition
  Avg_Reads; Standardized log10(average read counts) for a particular feature in a particular condition, averaged over replicates
  sdf; Dataframe that maps numerical feature ID to original feature name. Also has read depth information
  sample_lookup; Lookup table relating MT and R to the original sample name
Fn_est; A data frame containing fraction new estimates for +s4U samples:
  sample; Original sample name
  XF; Original feature name
  fn; Fraction new estimate
  n; Number of reads
  Feature_ID; Numerical ID for each feature
  Replicate; Numerical ID for each replicate
  Exp_ID; Numerical ID for each experimental condition
  tl; s4U label time
  logit_fn; logit of fraction new estimate
  kdeg; degradation rate constant estimate
  log_kdeg; log of degradation rate constant estimate
  logit_fn_se; Uncertainty of logit(fraction new) estimate
  log_kd_se; Uncertainty of log(kdeg) estimate
Count_Matrix; A matrix with read count information. Each column represents a sample and each row represents a feature. Each entry is the raw number of read counts mapping to a particular feature in a particular sample. Column names are the corresponding sample names and row names are the corresponding feature names.
Ctl_data; Identical content to Fn_est but for any -s4U data (and thus with fn estimates set to 0). Will be NULL if no -s4U data is present.
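The derived columns of Fn_est (logit_fn, kdeg, log_kdeg) relate to fn and tl in a simple way, sketched below with hypothetical values. The sketch assumes the usual exponential-decay relation for a pulse-label design, fn = 1 - exp(-kdeg * tl); this page does not state the formula, so treat it as an assumption rather than the exact computation in fn_process. A pulse-chase design (Chase = TRUE) would need a different relation, and the uncertainty columns are not covered here.

```r
# Hypothetical fraction new estimates and s4U label times (same time units as kdeg^-1)
fn <- c(0.10, 0.50, 0.90)
tl <- c(2, 2, 2)

# logit transform: log(fn / (1 - fn))
logit_fn <- qlogis(fn)

# Assumed pulse-label relation: fn = 1 - exp(-kdeg * tl)
kdeg <- -log(1 - fn) / tl
log_kdeg <- log(kdeg)

est <- data.frame(fn, tl, logit_fn, kdeg, log_kdeg)
print(est)

# The inverse logit recovers the original fraction new estimates
print(plogis(logit_fn))
```

Working on the logit and log scales keeps the transformed estimates unbounded, which is why both transformed columns carry their own uncertainty estimates.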
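# Note: this example builds a bakRData object and preprocesses it with cBprocess;
# fn_process itself expects an object of class bakRFnData instead.
# Load cB
data("cB_small")
# Load metadf
data("metadf")
# Create bakRData
bakRData <- bakRData(cB_small, metadf)
# Preprocess data
data_for_bakR <- cBprocess(obj = bakRData)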