sigTest: Significance tests

sigTest R Documentation

Significance tests

Description

Significance tests

Usage

sigTest(
  df,
  id,
  label_scheme_sub,
  scale_log2r,
  complete_cases,
  impute_na,
  rm_allna,
  method_replace_na,
  filepath,
  filename,
  method,
  padj_method,
  var_cutoff,
  pval_cutoff,
  logFC_cutoff,
  data_type,
  anal_type,
  ...
)

Arguments

df

The name of a primary data file. By default, it is determined automatically by matching the types of data and analysis with an id among c("pep_seq", "pep_seq_mod", "prot_acc", "gene"). A primary file contains normalized peptide or protein data and is one of c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt"). For analyses that require significance p-values, df will be one of c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt").

id

Character string; one of pep_seq, pep_seq_mod, prot_acc and gene.

label_scheme_sub

A data frame. Subset entries from label_scheme for selected samples.

scale_log2r

Logical; if TRUE, adjusts log2FC to the same scale of standard deviation across all samples. The default is TRUE. At scale_log2r = NA, the raw log2FC without normalization will be used.

complete_cases

Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.

impute_na

Logical; if TRUE, data with the imputation of missing values will be used. The default is FALSE.

rm_allna

Logical; if TRUE, removes data rows that are exclusively NA across ratio columns of log2_R126 etc. The setting also applies to log2_R000 in LFQ.
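As an illustration of what rm_allna = TRUE describes, the following base R sketch drops rows whose ratio columns are all NA. The toy data frame and the regular expression used to find ratio columns are assumptions made for this example, not proteoQ internals.

```r
# Hypothetical toy data; column names mimic the log2_R126 etc. pattern.
df <- data.frame(
  prot_acc  = c("P1", "P2", "P3"),
  log2_R126 = c(0.2, NA, NA),
  log2_R127 = c(NA, NA, -0.4),
  stringsAsFactors = FALSE
)

# Locate ratio columns by their log2_R prefix (assumed naming rule).
ratio_cols <- grep("^log2_R[0-9]", names(df), value = TRUE)

# A row is dropped only when every ratio column is NA.
all_na  <- rowSums(!is.na(df[, ratio_cols, drop = FALSE])) == 0
df_kept <- df[!all_na, , drop = FALSE]
```

Rows with at least one observed ratio are retained, so partially missing entries remain available to downstream tests.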

method_replace_na

The method for replacing NA values by row. The default is none, which leaves the data unchanged. At method_replace_na = min, NA values are replaced with the row minimum. The argument is only a device for assessing pVals, e.g., for handling the case of all NA values under one group and non-trivial values under another. The setting of min may be useful when, at the experimenter's discretion, NA values are ascribed to a lack of signal. The argument has no effect at impute_na = TRUE.
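The idea behind method_replace_na = min can be sketched in base R as below. The helper replace_na_rowmin is hypothetical and written only for this example; it is not the function used inside proteoQ, and the handling of rows that are entirely NA (left unchanged here) is an assumption.

```r
# Toy matrix of log2FC values: rows are entries, columns are samples.
m <- matrix(c(1.0,  NA,  0.5,
               NA,  NA, -0.3,
               NA,  NA,   NA),
            nrow = 3, byrow = TRUE)

# Hypothetical helper: fill NA values with the minimum of the same row.
replace_na_rowmin <- function(x) {
  t(apply(x, 1, function(r) {
    if (all(is.na(r))) return(r)       # nothing to borrow from; keep as NA
    r[is.na(r)] <- min(r, na.rm = TRUE)
    r
  }))
}

out <- replace_na_rowmin(m)
```

After the replacement, a group that was entirely NA within a row carries the row minimum, so a test contrasting it against a group with measured values can return a p-value.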

filepath

A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of id in the call.

filename

A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is .txt. Image files are typically saved via ggsave or pheatmap, where the image type is determined by the extension of the file name.

method

Dummy argument to avoid invoking the corresponding argument in dist through partial argument matching.

padj_method

Character string; the method of multiple-test correction for use with p.adjust. The default is "BH". See ?p.adjust.methods for additional choices.
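For orientation, the following lines show how a padj_method value maps onto base R's p.adjust. The p-values are made up for the example.

```r
# Made-up raw p-values for five entries.
pvals <- c(0.001, 0.01, 0.02, 0.04, 0.30)

# Benjamini-Hochberg adjustment, matching the default padj_method = "BH".
padj_bh <- p.adjust(pvals, method = "BH")

# Other valid choices are listed in the built-in vector p.adjust.methods.
choices <- p.adjust.methods
```

Any element of p.adjust.methods, such as "bonferroni" or "holm", can be supplied in place of "BH".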

var_cutoff

Numeric; the cut-off in the variances of protein log2FC. Entries with variances smaller than the threshold will be removed from GSVA. The default is 0.5.
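A variance filter of this kind might look as follows in base R. This is a sketch under assumed toy data; whether the boundary value itself is kept (as it is here) is also an assumption.

```r
# Toy protein log2FC matrix: rows are proteins, columns are samples.
log2fc <- rbind(
  P1 = c(-1.5, 0.0,  1.5),
  P2 = c( 0.1, 0.0, -0.1),
  P3 = c( 2.0, 0.5,   NA)
)

var_cutoff <- 0.5

# Row-wise variances, ignoring missing values.
row_vars <- apply(log2fc, 1, var, na.rm = TRUE)

# Keep entries whose variance reaches the threshold.
kept <- log2fc[!is.na(row_vars) & row_vars >= var_cutoff, , drop = FALSE]
```

Here P2 varies too little across samples and is removed, while P1 and P3 are retained.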

pval_cutoff

Numeric value or vector; the cut-off in protein significance pVal. Entries with pVals less significant than the threshold will be excluded from enrichment analysis. The default is 0.05 for all formulas matched to or specified in argument fml_nms. Formula-specific thresholds are allowed by supplying a vector of cut-off values.

logFC_cutoff

Numeric value or vector; the cut-off in protein log2FC. Entries with absolute log2FC smaller than the threshold will be excluded from enrichment analysis. The default magnitude is log2(1.2) for all formulas matched to or specified in argument fml_nms. Formula-specific thresholds are allowed by supplying a vector of absolute values in log2FC.
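The combined effect of pval_cutoff and logFC_cutoff can be sketched in base R as below. The data frame and its column names (pVal, log2Ratio) are hypothetical, and treating both boundaries as inclusive is an assumption.

```r
# Hypothetical results for a single formula.
res <- data.frame(
  gene      = c("A", "B", "C", "D"),
  pVal      = c(0.001, 0.20, 0.03, 0.04),
  log2Ratio = c(1.0, 2.0, 0.1, -0.8),
  stringsAsFactors = FALSE
)

pval_cutoff  <- 0.05
logFC_cutoff <- log2(1.2)  # roughly 0.263, i.e. a 1.2-fold change

# An entry passes when it is significant enough AND changes enough,
# in either direction.
keep   <- res$pVal <= pval_cutoff & abs(res$log2Ratio) >= logFC_cutoff
passed <- res$gene[keep]
```

Entry B fails on significance and entry C fails on effect size, which leaves A and D. With several formulas, a vector such as c(0.05, 0.01) would supply one threshold per formula.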

data_type

The type of data being either Peptide or Protein.

anal_type

Character string; the type of analysis that is preset for method dispatch in function factories. The value will be determined automatically. Exemplary values include anal_type = c("PCA", "Corrplot", "EucDist", "GSPA", "Heatmap", "Histogram", "MDS", "Model", "NMF", "Purge", "Trend", "LDA", ...).

...

filter_: Variable argument statements for the row filtration of data against the column keys in Peptide.txt for peptides or Protein.txt for proteins. Each statement contains a list of logical expression(s). The lhs needs to start with filter_. The logical condition(s) at the rhs need to be enclosed in exprs with round parentheses.

For example, pep_len is a column key in Peptide.txt. The statement filter_peps_at = exprs(pep_len <= 50) will remove peptide entries with pep_len > 50. See also normPSM.
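To make the mechanics of a filter_ statement concrete, here is a rough base R analogue. It uses alist in place of exprs to capture unevaluated conditions and then evaluates them against the data columns. The toy table, the pep_score column and its threshold are hypothetical, and the actual evaluation inside proteoQ may differ.

```r
# Toy peptide table; pep_len follows the example in the text,
# pep_score is a hypothetical second column.
peptides <- data.frame(
  pep_seq   = c("AAK", "LLR", "GGK"),
  pep_len   = c(12, 61, 50),
  pep_score = c(30, 45, 8),
  stringsAsFactors = FALSE
)

# Capture the conditions without evaluating them, similar in spirit to
# filter_peps_at = exprs(pep_len <= 50, pep_score > 10).
filter_peps_at <- alist(pep_len <= 50, pep_score > 10)

# Evaluate each condition with the columns in scope, then AND them together.
conds <- lapply(filter_peps_at, function(e) eval(e, peptides))
keep  <- Reduce(`&`, conds)

filtered <- peptides[keep, , drop = FALSE]
```

Each condition refers to column keys by bare name, which is why the expressions must be captured unevaluated rather than passed as ordinary values.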

Additional parameters for plotting with ggplot2:
xmin, the minimum x at a log2 scale; the default is -2.
xmax, the maximum x at a log2 scale; the default is +2.
xbreaks, the breaks in x-axis at a log2 scale; the default is 1.
binwidth, the binwidth of log2FC; the default is (xmax - xmin)/80.
ncol, the number of columns; the default is 1.
width, the width of plot.
height, the height of plot.
scales, should the scales be fixed across panels; the default is "fixed" and the alternative is "free".
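The defaults listed above work out as follows. This is plain arithmetic in base R rather than a call into proteoQ or ggplot2, and the variable names simply mirror the parameter names.

```r
xmin    <- -2
xmax    <-  2
xbreaks <-  1

# Default bin width of log2FC: (xmax - xmin) / 80.
binwidth <- (xmax - xmin) / 80

# Positions of the x-axis breaks on the log2 scale.
breaks <- seq(xmin, xmax, by = xbreaks)

# Number of bins spanning the x-range at the default bin width.
n_bins <- (xmax - xmin) / binwidth
```

With the default limits the bin width is 0.05, giving 80 bins between -2 and +2 with axis breaks at every integer.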


qzhang503/proteoQ documentation built on March 16, 2024, 5:27 a.m.