process_wide_format: Merging rows with identical values in a particular column in...

process_wide_format    R Documentation

Merging rows with identical values in a particular column in a table

Description

Collapses rows with identical values in a particular column in a table. When the values in each row are proportional, such as the intensities of multiple fragments of a protein, the MaxLFQ algorithm is recommended.

Usage

process_wide_format(input_filename,
                    output_filename,
                    id_column,
                    quant_columns,
                    data_in_log_space = FALSE,
                    annotation_columns = NULL,
                    method = "maxLFQ")

Arguments

input_filename

Input filename of a tab-separated value text file.

output_filename

Output filename.

id_column

The column where unique values will be kept. Rows with identical values in this column are merged. Rows with empty values here are removed.

quant_columns

Columns containing numerical data to be merged.

data_in_log_space

A logical value. If FALSE, the numerical data will be log2-transformed.

annotation_columns

Columns in the input file apart from id_column and quant_columns that will be kept in the output.

method

Method for merging. Default value is "maxLFQ". Possible values are "maxLFQ", "maxLFQ_R", "median_polish", "top3", "top5", "meanInt", "maxInt", "sum", "least_na" and any function for collapsing a numerical matrix to a row vector.

Details

Method "maxLFQ_R" implements the MaxLFQ algorithm in pure R. It is slower than "maxLFQ".

Method "maxInt" selects the row with the maximum intensity (top 1).

Method "sum" sums all intensities.

Method "least_na" selects the row with the fewest missing values.
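
The idea behind "least_na" can be illustrated in base R. The following is a minimal sketch on a toy matrix, not the package's internal code; the matrix x, its row and column names, and the tie-breaking rule (first row wins) are assumptions made for illustration.

```r
# Toy log2-intensity matrix: rows share one id, columns are samples.
# All values here are made up for illustration.
x <- matrix(c(10, NA,   12,
              11, 11.5, 12.2,
              NA, NA,   9),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("s1", "s2", "s3")))

# Count missing values in each row.
na_counts <- rowSums(is.na(x))

# Keep the row with the fewest NAs. which.min() returns the first such row
# when there are ties; that tie-breaking is a choice of this sketch only.
least_na_row <- x[which.min(na_counts), ]
```

Here row2 has no missing values, so it is the row that is kept.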

The value of method can be a function such as function(x) log2(colSums(2^x, na.rm = TRUE)) for summing all intensities in the original space.
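
To see what such a function does, it can be applied directly to a small matrix in log2 space. This sketch uses only base R; the name sum_original_space and the toy matrix are hypothetical and chosen for illustration.

```r
# The function quoted in the text: sum intensities in the original space
# and return the result in log2 space.
sum_original_space <- function(x) log2(colSums(2^x, na.rm = TRUE))

# Toy log2 matrix: 2 rows to be merged, 2 samples. Raw intensities are
# 1 and 3 in sample s1, and 2 and 6 in sample s2.
x <- log2(matrix(c(1, 3, 2, 6), nrow = 2,
                 dimnames = list(NULL, c("s1", "s2"))))

# One value per sample; the raw sums are 4 and 8, i.e. about 2 and 3 in log2.
merged <- sum_original_space(x)
```

Note that with na.rm = TRUE a sample whose values are all missing sums to 0, so this particular function yields -Inf for that sample. A custom function may need to map that case to NA explicitly.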

Value

The result table is written to output_filename. A NULL value is returned.
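
A complete call might look like the sketch below. The toy table, its column names (protein, gene, s1 to s3) and the temporary file paths are hypothetical. Passing columns by name rather than by index is an assumption of this sketch. The call is guarded so that the block still runs when the iq package is not installed.

```r
# Hypothetical toy input: two rows per protein, intensities in the
# original (non-log) space.
toy <- data.frame(
  protein = c("P1", "P1", "P2", "P2"),
  gene    = c("G1", "G1", "G2", "G2"),
  s1      = c(100, 200, 50, 80),
  s2      = c(110, 210, 55, 90),
  s3      = c(120, 190, 60, 85)
)

# Write the table as a tab-separated text file, the documented input format.
in_file  <- tempfile(fileext = ".tsv")
out_file <- tempfile(fileext = ".tsv")
write.table(toy, in_file, sep = "\t", quote = FALSE, row.names = FALSE)

res <- NULL
if (requireNamespace("iq", quietly = TRUE)) {
  # Merge rows that share the same value in the "protein" column.
  res <- iq::process_wide_format(in_file, out_file,
                                 id_column = "protein",
                                 quant_columns = c("s1", "s2", "s3"),
                                 data_in_log_space = FALSE,
                                 annotation_columns = "gene",
                                 method = "maxLFQ")

  # The merged table is in the output file, not in the return value.
  merged <- read.delim(out_file, stringsAsFactors = FALSE)
  print(merged)
}
```

Because the function returns NULL, the merged table has to be read back from output_filename, as done above with read.delim.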

Author(s)

Thang V. Pham

References

Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.


iq documentation built on March 31, 2023, 11:34 p.m.