prepare_fungiexpresz_fpkm_matrix_rds_files_from_stringTie_output: Prepare FPKM matrix from stringTie output

View source: R/prepare_fpkm_files_from_excel_data_functions.R

prepare_fungiexpresz_fpkm_matrix_rds_files_from_stringTie_output    R Documentation

Prepare FPKM matrix from stringTie output

Description

A helper function that prepares FPKM data for inclusion in FungiExpresZ.

Usage

prepare_fungiexpresz_fpkm_matrix_rds_files_from_stringTie_output(
  data_dir,
  frac = 0.01
)

Arguments

data_dir

a character string giving a valid directory path in which the stringTie output files and a mapping-rate file are stored. See Details.

frac

a double (default 0.01) giving the value added to each FPKM value before the log2 transformation.

Details

'data_dir' must contain two types of files: 1) .xls files (each must be suffixed with '_expression_values.xls') and 2) a .csv file (must be suffixed with '_mapping_rate.csv'). Each .xls file is the default per-sample output of the stringTie program. FPKM values are taken from the 9th column of this file. The .csv file contains the mapping rate of the samples from all .xls files. It requires at least 3 columns named 'study' (the SRA study id), 'sample' (the SRA id), and 'mapping_rate' (the mapping rate, where 1 is equal to 100%).
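The expected directory layout can be sketched in base R as follows. This is only an illustration: the study and sample ids, the 'demo' prefix of the .csv file, and the mapping rates are made-up placeholders, and the .xls files are created empty rather than holding real stringTie output.

```r
# Hypothetical layout for 'data_dir'; all ids and values below are placeholders.
data_dir <- file.path(tempdir(), "stringtie_demo")
dir.create(data_dir, showWarnings = FALSE)

samples <- c("SRR0000001", "SRR0000002")

# One file per sample, named '<SRAID>_expression_values.xls'.
# Created empty here; real files would hold the stringTie output.
for (s in samples) {
  file.create(file.path(data_dir, paste0(s, "_expression_values.xls")))
}

# Mapping-rate table with the three required columns.
# mapping_rate is a fraction, so 0.93 stands for 93%.
mapping <- data.frame(
  study = c("SRP000001", "SRP000001"),
  sample = samples,
  mapping_rate = c(0.93, 0.88)
)
write.csv(mapping, file.path(data_dir, "demo_mapping_rate.csv"), row.names = FALSE)

list.files(data_dir)
```

A directory prepared this way, but with real stringTie output in the .xls files, is what would be passed as 'data_dir'.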

The returned object is a tibble of log2(FPKM + frac) values. Each column is an SRA sample and each row is a gene. Rows containing 0 in all samples and columns containing 0 in all rows are discarded.
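The transformation can be illustrated on a small made-up matrix. This is a sketch of the arithmetic, not the package's internal code; in particular, it assumes the all-zero filter is applied to the raw FPKM values before the log2 step, which the documentation does not state explicitly. Gene and sample names are hypothetical.

```r
frac <- 0.01

# Toy FPKM matrix: genes in rows, samples in columns (values are made up).
fpkm <- matrix(
  c(0,    0, 0,
    3.99, 0, 15.99,
    0,    0, 7.99),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2", "S3"))
)

# Assumption: drop all-zero genes and all-zero samples before transforming.
keep_rows <- rowSums(fpkm != 0) > 0
keep_cols <- colSums(fpkm != 0) > 0
filtered <- fpkm[keep_rows, keep_cols, drop = FALSE]

# Adding frac keeps the remaining zeros finite under log2.
log_fpkm <- log2(filtered + frac)
log_fpkm
```

Here geneA and S2 are removed because they are 0 everywhere, while the single remaining 0 (geneC in S1) becomes log2(0.01) instead of -Inf.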

Column names (in most cases SRA ids) are derived from the file names. The .xls files are therefore required to be named in the format '<SRAID>_expression_values.xls'.
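The naming rule implies that a column name is whatever precedes the fixed suffix. One way to recover it in base R is sketched below; the paths are hypothetical, and the function itself may derive the names differently.

```r
# Hypothetical file paths following '<SRAID>_expression_values.xls'.
files <- c(
  "/path/to/data/SRR0000001_expression_values.xls",
  "/path/to/data/SRR0000002_expression_values.xls"
)

# Strip the directory, then the fixed suffix, to get the sample ids.
sample_ids <- sub("_expression_values\\.xls$", "", basename(files))
sample_ids
```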

Value

a tibble containing sample-wise log2(FPKM + frac) values.


cparsania/FungiExpresZ documentation built on March 15, 2024, 5:48 p.m.