View source: R/testDifferentialAbundance.R
testDifferentialAbundance | R Documentation |
Test Differential Protein Expression in MS proteomics data starting small: From the precursor level.
testDifferentialAbundance(
  input_dt = "path/to/DIANN_matrix.tsv",
  protein_group_annotation = NULL,
  study_design = "path/to/Study_design_filled.tsv",
  normalize_data = TRUE,
  normalization_function = limma::normalizeQuantiles,
  condition_1 = unique(fread(study_design)$condition)[2],
  condition_2 = unique(fread(study_design)$condition)[1],
  min_n_obs = 4,
  imp_percentile = 0.001,
  imp_sd = 0.2,
  plot_pdf = TRUE,
  write_tsv_tables = TRUE,
  target_protein = "O08760"
)
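The default values of condition_1 and condition_2 in the usage are taken from the order in which conditions first appear in the study design. The following minimal sketch in base R shows which value ends up where. The toy study design and its condition labels are hypothetical, and only the condition column is shown; the other mandatory columns are omitted.

```r
# Hypothetical study design; only the condition column is shown here
study_design <- data.frame(
  condition = c("control", "control", "treated", "treated"),
  stringsAsFactors = FALSE
)

# Unique conditions in order of first appearance
conds <- unique(study_design$condition)

# Mirrors the defaults in the usage: second unique value becomes
# condition_1, first unique value becomes condition_2
condition_1 <- conds[2]
condition_2 <- conds[1]

condition_1
condition_2
```

If the row order of the study design does not give the intended comparison, passing condition_1 and condition_2 explicitly avoids depending on it.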
input_dt |
Input data table, either as a path to a tsv/txt file or already loaded in R as a data.table or data.frame. The table should have the following columns:
Note: The data will be log2-transformed internally. |
protein_group_annotation |
Protein annotation table with columns Protein.Group and Protein.Names (and others if desired) that will be used to annotate the results. By default (NULL), the annotation is assumed to be contained in input_dt, and an attempt will be made to extract it from there. |
study_design |
Study design in tab-separated .txt with mandatory columns:
|
normalize_data |
Whether or not data is scaled/normalized before differential testing. In some cases it might be preferable not to scale the datasets, e.g. when comparing pulldowns vs. input samples! Defaults to TRUE. |
normalization_function |
Normalization function that transforms a matrix of quantities in which columns are samples and rows are analytes. Defaults to limma::normalizeQuantiles, but can be replaced with any such function. You may want to try limma::normalizeVSN or limma::normalizeMedianValues. |
condition_1 |
Manual override for condition 1 of the differential comparison. By default, the second unique value of the condition column in the study design is used, i.e. unique(fread(study_design)$condition)[2]. |
condition_2 |
Manual override for condition 2 of the differential comparison. By default, the first unique value of the condition column in the study design is used, i.e. unique(fread(study_design)$condition)[1]. |
min_n_obs |
Minimum number of observations per precursor (number of runs it was identified in) required to keep it in the analysis. Defaults to 4. |
imp_percentile |
Percentile of the total distribution of values on which the random distribution for sampling will be centered |
imp_sd |
Standard deviation of the normal distribution from which values are sampled to impute missing values. Defaults to 0.2. |
plot_pdf |
Whether to document the processing steps in a series of PDF plots. Defaults to TRUE. |
write_tsv_tables |
Whether to write out the final quantification table with differential expression testing results. Defaults to TRUE. |
target_protein |
Optional string with protein identifier to highlight in volcano plots |
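The interplay of min_n_obs, normalization_function, imp_percentile and imp_sd described in the arguments above can be illustrated on a toy matrix. This is a minimal sketch in base R, not the package's implementation. The helper names median_center and impute_low, the toy data, and the order of the steps are assumptions made for illustration. It further assumes that filtering counts non-missing values per row, that a normalization function takes and returns a matrix with samples in columns, and that imp_percentile is passed to quantile() as a probability.

```r
set.seed(1)

# Toy log2 intensity matrix: 6 precursors (rows) x 6 runs (columns)
m <- matrix(rnorm(36, mean = 20, sd = 2), nrow = 6,
            dimnames = list(paste0("prec", 1:6), paste0("run", 1:6)))
m[1, 1:4] <- NA   # prec1 is observed in only 2 runs
m[2, 1]   <- NA   # prec2 is observed in 5 runs

# Keep precursors observed in at least min_n_obs runs
min_n_obs <- 4
m_filt <- m[rowSums(!is.na(m)) >= min_n_obs, , drop = FALSE]

# Hypothetical custom normalization function (matrix in, matrix out):
# shift every column so that all column medians become equal
median_center <- function(x) {
  col_medians <- apply(x, 2, median, na.rm = TRUE)
  sweep(x, 2, col_medians - mean(col_medians))
}
m_norm <- median_center(m_filt)

# Hypothetical imputation: replace missing values with draws from a
# normal distribution centered on a low quantile of all observed values
impute_low <- function(x, imp_percentile = 0.001, imp_sd = 0.2) {
  center  <- quantile(x, probs = imp_percentile, na.rm = TRUE)
  missing <- is.na(x)
  x[missing] <- rnorm(sum(missing), mean = center, sd = imp_sd)
  x
}
m_imp <- impute_low(m_norm)

dim(m_filt)    # prec1 has been removed
anyNA(m_imp)   # no missing values remain after imputation
```

A function with the same matrix-in, matrix-out shape as median_center could in principle be passed as normalization_function in place of the default, since the argument accepts any function of that form.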
A diffExpr object (list) containing the following elements (access via x$name or x[["name"]]):
data_source: input_dt path or input R object name
data_long: Data in long format
data_matrix_log2: Data, filtered and log2 transformed, in wide format matrix
data_matrix_log2_imp: Data, filtered, log2 transformed and with missing values imputed, in wide format matrix
study_design: study design table
annotation_col: column annotation
diffExpr_result_dt: Result table with intensities and differential expression testing results
candidates_condition1: Proteins that appear more abundant in condition 1
candidates_condition2: Proteins that appear more abundant in condition 2
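The two access styles for the returned list can be sketched as follows. The object below is a mock standing in for a real return value: the element names are taken from the list above, but the contents (including the placeholder protein identifiers and the use of character vectors) are invented for illustration and may differ from what the function actually returns.

```r
# Mock object standing in for the return value (contents are invented)
res <- list(
  data_source = "path/to/DIANN_matrix.tsv",
  candidates_condition1 = c("P00001", "P00002"),
  candidates_condition2 = character(0)
)

names(res)                       # list the available elements
res$candidates_condition1        # access with $
res[["candidates_condition1"]]   # equivalent access by name

# Both access styles return the same element
identical(res$candidates_condition1, res[["candidates_condition1"]])
```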
Moritz Heusel