gspaTest: Perform GSPA tests
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

Description

`logFC_cutoff`: Numeric. A threshold for subsetting the data before the calculation of adjusted p-values.
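
The subset-then-adjust order described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data frame `df`, its column names and its values are hypothetical, the adjustment is assumed to be Benjamini-Hochberg (the method named under `use_adjP`), and entries exactly at the threshold are assumed to be kept.

```r
# Toy protein-level results (hypothetical values, not proteoQ output)
df <- data.frame(
  gene   = c("A", "B", "C", "D", "E"),
  log2FC = c(0.90, -0.10, 0.50, -0.70, 0.05),
  pVal   = c(0.001, 0.200, 0.010, 0.030, 0.800)
)

# Default magnitude from the Usage signature (about 0.263)
logFC_cutoff <- log2(1.2)

# Step 1: keep entries whose absolute log2FC reaches the threshold
kept <- df[abs(df$log2FC) >= logFC_cutoff, ]

# Step 2: adjust p-values over the retained entries only
# (assumption: Benjamini-Hochberg, via stats::p.adjust)
kept$adjP <- p.adjust(kept$pVal, method = "BH")

print(kept)
```

Because the adjustment runs after subsetting, the number of tests entering the correction is the number of retained entries, not the number of rows in the full table.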

Usage

gspaTest(
  df = NULL,
  id = "entrez",
  label_scheme_sub = NULL,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  filepath = NULL,
  filename = NULL,
  gset_nms = "go_sets",
  var_cutoff = 0.5,
  pval_cutoff = 0.05,
  logFC_cutoff = log2(1.2),
  gspval_cutoff = 0.05,
  gslogFC_cutoff = log2(1),
  min_size = 6,
  max_size = Inf,
  min_delta = 4,
  min_greedy_size = 1,
  use_adjP = FALSE,
  method = "mean",
  anal_type = "GSPA",
  ...
)

Arguments

`df`	The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an `id` among `c("pep_seq", "pep_seq_mod", "prot_acc", "gene")`. A primary file contains normalized peptide or protein data and is among `c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`. For analyses that require significance p-values, the `df` will be one of `c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`.
`id`	Character string; the type of gene identifier. Currently only `"entrez"` is supported.
`label_scheme_sub`	A data frame. Subset entries from `label_scheme` for selected samples.
`scale_log2r`	Logical; if TRUE, adjusts `log2FC` to the same scale of standard deviation across all samples. The default is TRUE. At `scale_log2r = NA`, the raw `log2FC` without normalization will be used.
`complete_cases`	Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.
`impute_na`	Logical; if TRUE, data with the imputation of missing values will be used. The default is FALSE.
`filepath`	A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of `id` in the `call`.
`filename`	A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is `.txt`. Image files are typically saved via `ggsave` or `pheatmap`, where the image type is determined by the extension of the file name.
`gset_nms`	Character string or vector containing the shorthand name(s), full file path(s), or both, to gene sets for enrichment analysis. For species among `"human", "mouse", "rat"`, the default of `c("go_sets", "c2_msig", "kinsub")` will utilize terms from gene ontology (`GO`), molecular signatures (`MSig`) and kinase-substrate network (`PSP Kinase-Substrate`). Custom `GO`, `MSig` and other databases for a given species are also supported. See also: `prepGO` for the preparation of custom `GO`; `prepMSig` for the preparation of custom `MSig`. For other custom databases, follow the same list format as `GO` or `MSig`.
`var_cutoff`	Numeric; the cut-off in the variances of protein log2FC. Entries with variances smaller than the threshold will be removed from GSVA. The default is 0.5.
`pval_cutoff`	Numeric value or vector; the cut-off in protein significance `pVal`. Entries with `pVals` less significant than the threshold will be excluded from enrichment analysis. The default is 0.05 for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of cut-off values.
`logFC_cutoff`	Numeric value or vector; the cut-off in protein `log2FC`. Entries with absolute `log2FC` smaller than the threshold will be excluded from enrichment analysis. The default magnitude is `log2(1.2)` for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of absolute values in `log2FC`.
`gspval_cutoff`	Numeric value or vector; the cut-off in gene-set significance `pVal`. Only enrichment terms with `pVals` more significant than the threshold will be reported. The default is 0.05 for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of cut-off values.
`gslogFC_cutoff`	Numeric value or vector; the cut-off in gene-set enrichment fold change. Only enrichment terms with absolute fold change greater than the threshold will be reported. The default magnitude is `log2(1)` for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of absolute values in `log2FC`.
`min_size`	Numeric value or vector; minimum number of protein entries for consideration in gene set tests. The number is after data filtration by `pval_cutoff`, `logFC_cutoff` or varargs expressions under `filter_`. The default is 6 for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of sizes.
`max_size`	Numeric value or vector; maximum number of protein entries for consideration in gene set tests. The number is after data filtration by `pval_cutoff`, `logFC_cutoff` or varargs expressions under `filter_`. The default is infinite for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of sizes.
`min_delta`	Numeric value or vector; the minimum count difference between the up- and the down-expressed groups of proteins for consideration in gene set tests. For example, at `min_delta = 4`, a gene set with 6 up-expressed proteins and 2 down-expressed proteins, or vice versa, will be assessed. The number is after data filtration by `pval_cutoff`, `logFC_cutoff` or varargs expressions under `filter_`. The default is 4 for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of sizes.
`min_greedy_size`	Numeric value or vector; minimum number of unique protein entries for a gene set to be considered essential. The default is `1` for all formulas matched to or specified in argument `fml_nms`. Formula-specific threshold is allowed by supplying a vector of sizes.
`use_adjP`	Logical; if TRUE, use Benjamini-Hochberg pVals. The default is FALSE.
`method`	Dummy argument that prevents partial argument matching to the corresponding argument in `dist`.
`anal_type`	Character string; the type of analysis that is preset for method dispatch in function factories. The value will be determined automatically. Exemplary values include `anal_type = c("PCA", "Corrplot", "EucDist", "GSPA", "Heatmap", "Histogram", "MDS", "Model", "NMF", "Purge", "Trend", "LDA", ...)`.
`...`	`filter_`: Variable argument statements for the row filtration of data against the column keys in `Peptide.txt` for peptides or `Protein.txt` for proteins. Each statement contains a list of logical expression(s). The `lhs` needs to start with `filter_`. The logical condition(s) at the `rhs` need to be enclosed in `exprs` with round parentheses. For example, `pep_len` is a column key in `Peptide.txt`. The statement `filter_peps_at = exprs(pep_len <= 50)` will remove peptide entries with `pep_len > 50`. See also `normPSM`. Additional parameters for plotting with `ggplot2`: `xmin`, the minimum `x` at a log2 scale; the default is -2. `xmax`, the maximum `x` at a log2 scale; the default is +2. `xbreaks`, the breaks in `x`-axis at a log2 scale; the default is 1. `binwidth`, the bin width of `log2FC`; the default is `(xmax - xmin)/80`. `ncol`, the number of columns; the default is 1. `width`, the width of plot; `height`, the height of plot. `scales`, should the scales be fixed across panels; the default is "fixed" and the alternative is "free".
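
To make the interplay of `pval_cutoff`, `logFC_cutoff`, `min_size`, `max_size` and `min_delta` more concrete, the following base R sketch screens a single hypothetical gene set. It is not the package's code: the data frame `prot`, its values and the helper variables (`sig`, `n_up`, `n_down`, `assessed`) are invented for illustration, and the handling of values that fall exactly on a threshold is an assumption.

```r
# Toy protein-level results for one hypothetical gene set
prot <- data.frame(
  entrez = as.character(1:10),
  log2FC = c(0.8, 0.6, 0.5, 0.4, 0.9, 0.7, -0.5, -0.6, 0.1, 0.3),
  pVal   = c(0.01, 0.02, 0.03, 0.04, 0.001, 0.02, 0.01, 0.03, 0.04, 0.50)
)

# Cut-offs taken from the defaults in the Usage signature
pval_cutoff  <- 0.05
logFC_cutoff <- log2(1.2)
min_size     <- 6
max_size     <- Inf
min_delta    <- 4

# Protein-level filtration: drop entries that are less significant than
# pval_cutoff or whose absolute log2FC is smaller than logFC_cutoff
sig <- prot[prot$pVal <= pval_cutoff & abs(prot$log2FC) >= logFC_cutoff, ]

# Counts of up- and down-expressed proteins after filtration
n_up   <- sum(sig$log2FC > 0)
n_down <- sum(sig$log2FC < 0)

# Gene-set level checks, applied to the filtered entries
size_ok  <- nrow(sig) >= min_size && nrow(sig) <= max_size
delta_ok <- abs(n_up - n_down) >= min_delta
assessed <- size_ok && delta_ok

cat("up:", n_up, "down:", n_down, "assessed:", assessed, "\n")
```

With these toy values, the filtered set holds 6 up-expressed and 2 down-expressed proteins, which mirrors the `min_delta = 4` example given in the argument description.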

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
