preprocess | R Documentation |
Prepares a long-format input table, which includes removing low-intensity ions and performing median normalization.
preprocess(quant_table,
           primary_id = "PG.ProteinGroups",
           secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"),
           sample_id = "R.Condition",
           intensity_col = "F.PeakArea",
           median_normalization = TRUE,
           log2_intensity_cutoff = 0,
           pdf_out = "qc-plots.pdf",
           pdf_width = 12,
           pdf_height = 8,
           intensity_col_sep = NULL,
           intensity_col_id = NULL,
           na_string = "0")
quant_table |
A long-format table with a primary column of protein identification, secondary columns of fragment ions, a column of sample names, and a column for quantitative intensities. |
primary_id |
Unique values in this column form the list of proteins to be quantified. |
secondary_id |
A concatenation of these columns determines the fragment ions used for quantification. |
sample_id |
Unique values in this column form the list of samples. |
intensity_col |
The column for intensities. |
median_normalization |
A logical value. The default TRUE performs median normalization. |
log2_intensity_cutoff |
Entries lower than this value in log2 space are ignored. Plot a histogram of all intensities to set this parameter. |
pdf_out |
A character string specifying the name of the PDF output. A NULL value suppresses the PDF output. |
pdf_width |
Width of the PDF output in inches. |
pdf_height |
Height of the PDF output in inches. |
intensity_col_sep |
A separator character when entries in the intensity column contain multiple values. |
intensity_col_id |
The column for identities of multiple quantitative values. |
na_string |
The value considered as NA. |
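The effect of log2_intensity_cutoff and median_normalization can be illustrated on a toy table. The following is a minimal base R sketch, not the package's implementation: the data values and the cutoff of 0.5 are made up, and the normalization shown (shifting every sample so that its median equals the mean of all sample medians) is one common form that is assumed here for illustration.

```r
# Toy long-format table; column names mirror the defaults in the usage,
# but the values are invented for illustration.
toy <- data.frame(
  R.Condition = rep(c("S1", "S2"), each = 4),
  F.PeakArea  = c(1, 8, 16, 32, 2, 16, 32, 64),
  stringsAsFactors = FALSE
)

# Work in log2 space, since the cutoff is expressed in log2 units.
toy$log2_int <- log2(toy$F.PeakArea)

# A histogram of all log2 intensities helps to choose the cutoff.
if (interactive()) hist(toy$log2_int, breaks = 20)

# Ignore entries lower than a hypothetical cutoff of 0.5.
cutoff <- 0.5
kept <- toy[toy$log2_int >= cutoff, ]

# Assumed form of median normalization: shift each sample so that
# all sample medians coincide with the mean of the sample medians.
med <- tapply(kept$log2_int, kept$R.Condition, median)
sample_median <- ave(kept$log2_int, kept$R.Condition, FUN = median)
kept$normalized <- kept$log2_int - sample_median + mean(med)

# After normalization every sample has the same median.
tapply(kept$normalized, kept$R.Condition, median)
```

Because the shift is additive in log2 space, differences between fragment ions within a sample are unchanged; only the overall level of each sample moves.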
When entries in the intensity column contain multiple values, this function replicates the entries in the other columns, and the secondary_id is appended with the corresponding entries in intensity_col_id when it is provided. Otherwise, the integer values 1, 2, 3, etc. are used.
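The expansion of multi-valued intensity entries can be sketched in base R as follows. This is only an illustration of the behaviour described above, not the package's code: the toy table, the ";" separator, and the "_" used to join the suffix to the identifier are all assumptions.

```r
# Toy table in which each intensity entry packs several values.
toy <- data.frame(
  PG.ProteinGroups    = c("P1", "P2"),
  EG.ModifiedSequence = c("PEPTIDEA", "PEPTIDEB"),
  R.Condition         = c("S1", "S1"),
  F.PeakArea          = c("100;200;300", "50;60"),
  stringsAsFactors    = FALSE
)

sep <- ";"  # plays the role of intensity_col_sep
parts <- strsplit(toy$F.PeakArea, sep, fixed = TRUE)
n <- lengths(parts)  # number of values in each entry

# Replicate the other columns once per value, then attach the split values.
other_cols <- c("PG.ProteinGroups", "EG.ModifiedSequence", "R.Condition")
expanded <- toy[rep(seq_len(nrow(toy)), n), other_cols]
expanded$F.PeakArea <- as.numeric(unlist(parts))

# No identity column is given, so number the values 1, 2, 3, ...
# within each original entry (the "_" joiner is a hypothetical choice).
expanded$secondary <- paste(expanded$EG.ModifiedSequence, sequence(n), sep = "_")
rownames(expanded) <- NULL

expanded
```

When an identity column is available, its split entries would take the place of `sequence(n)` in the last step.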
A data frame is returned with the following components:
protein_list |
A vector of proteins. |
sample_list |
A vector of samples. |
id |
A vector of fragment ions to be used for quantification. |
quant |
A vector of log2 intensities. |
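To show how the four components fit together, here is a small base R sketch that uses a hand-made stand-in for the returned data frame. The values are invented; only the component names are taken from the description above.

```r
# Toy stand-in for the returned data frame (values are made up).
res <- data.frame(
  protein_list = c("P1", "P1", "P1", "P1", "P2", "P2"),
  sample_list  = c("S1", "S2", "S1", "S2", "S1", "S2"),
  id           = c("f1", "f1", "f2", "f2", "f3", "f3"),
  quant        = c(10, 11, 12, 13, 8, 9),
  stringsAsFactors = FALSE
)

# Number of distinct fragment ions available for each protein.
n_frag <- tapply(res$id, res$protein_list, function(x) length(unique(x)))

# Fragment-by-sample matrix of log2 intensities for one protein;
# combinations that were never observed would appear as NA.
p1 <- res[res$protein_list == "P1", ]
m <- tapply(p1$quant, list(p1$id, p1$sample_list), mean)

n_frag
m
```

Each row of the long table is one fragment ion measured in one sample, so reshaping by id and sample_list gives the per-protein matrix on which a protein-level summary can be computed.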
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
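# Load the example data set shipped with the iq package.
data("spikeins", package = "iq")
head(spikeins)
# This example set of spike-in proteins has already been median-normalized,
# so median normalization is switched off here.
norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL)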