prnGSPAHM: Heat map visualization of GSPA results

prnGSPAHM R Documentation

Heat map visualization of GSPA results

Description

prnGSPAHM visualizes the distances between essential gene sets and all gene sets as heat maps and networks.

Usage

prnGSPAHM(
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  fml_nms = NULL,
  annot_cols = NULL,
  annot_colnames = NULL,
  annot_rows = NULL,
  df2 = NULL,
  filename = NULL,
  ...
)

Arguments

scale_log2r

Logical; at the TRUE default, input files with _Z[...].txt in name will be used. Otherwise, files with _N[...].txt in name will be taken. An error will be thrown if no files are matched under given conditions.

complete_cases

Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.

impute_na

Logical; at TRUE, input files with _impNA[...].txt in name will be loaded. Otherwise, files without _impNA in name will be taken. An error will be thrown if no files are matched under given conditions. The default is FALSE.

fml_nms

Character string or vector; the formula name(s). By default, the formula(s) will match those used in pepSig or prnSig.

annot_cols

A character vector of column keys that can be found in _essmap.txt. The values under the selected keys will be used to color-code enrichment terms on the top of heat maps. The default is NULL without column annotation.

annot_colnames

A character vector of replacement name(s) for annot_cols. The default is NULL without name replacement.

annot_rows

A character vector of column keys that can be found in _essmeta.txt. The values under the selected keys will be used to color-code essential terms on the side of heat maps. The default is NULL without row annotation.

df2

Character vector or string; the name(s) of secondary data file(s). An informatic task, e.g. anal_prnTrend(...), against a primary df generates secondary files such as Protein_Trend_Z_nclust6.txt. See also prnHist for the description of a primary df; normPSM for the lists of df and df2.

filename

A representative file name for outputs. By default, it will be determined automatically by the name of the current call.

...

filter2_: Variable argument statements for the row filtration against data in secondary file(s) of _essmap.txt. Each statement contains a list of logical expression(s). The lhs needs to start with filter2_. The logical condition(s) at the rhs need to be enclosed in exprs with round parentheses. For example, distance is a column key in Protein_GSPA_Z_essmap.txt. The statement filter2_ = exprs(distance <= .95) will remove entries with distance > 0.95. See also normPSM for the format of filter_ statements against primary data.

arrange2_: Variable argument statements for the row ordering against data in secondary file(s) of _essmap.txt. The lhs needs to start with arrange2_. The expression(s) at the rhs need to be enclosed in exprs with round parentheses. For example, distance and size are column keys in Protein_GSPA_Z_essmap.txt. The statement arrange2_ = exprs(distance, size) will order entries by distance, then by size. See also prnHM for the format of arrange_ statements against primary data.

Additional arguments for pheatmap, e.g., fontsize.

Note arguments disabled from pheatmap:
annotation_col; instead use keys indicated in annot_cols
annotation_row; instead use keys indicated in annot_rows
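
To make the effect of the filter2_ and arrange2_ statements concrete, the following is a minimal base R sketch. It does not call proteoQ and is not its implementation: the data frame essmap is a made-up stand-in for a few rows of an _essmap.txt file, and quote() with eval() is used here only as an assumed analogue of capturing expressions with exprs.

```r
# Toy stand-in for rows of an _essmap.txt file (all values are made up)
essmap <- data.frame(
  term     = c("hs_a", "hs_b", "mm_c", "hs_d"),
  ess_term = c("hs_a", "hs_a", "mm_c", "hs_a"),
  size     = c(30, 12, 20, 8),
  distance = c(0, 0.40, 0, 0.97),
  stringsAsFactors = FALSE
)

# Unevaluated expressions, similar in spirit to exprs(...) (assumption)
filter_expr   <- quote(distance <= .95)
arrange_exprs <- list(quote(distance), quote(size))

# Row filtration: evaluate the logical expression against the columns
keep     <- eval(filter_expr, essmap)
filtered <- essmap[keep, ]

# Row ordering: by distance first, then by size
ord      <- do.call(order, lapply(arrange_exprs, eval, envir = filtered))
arranged <- filtered[ord, ]

print(arranged)
```

In this toy case the row with distance 0.97 is dropped, and the two rows tied at distance 0 are ordered by size.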

Details

The list of gene sets and the associated quality metrics of size and ess_size are assessed after data filtration by the criteria specified in arguments pval_cutoff and logFC_cutoff (see prnGSPA), as well as optional varargs of filter_.

Protein_GSPA_[...].txt

Key           Description
term          a gene set term
is_essential  a logical indicator of gene set essentiality
size          the number of IDs under a term
ess_size      the number of IDs that can be found under a corresponding essential set
contrast      a contrast of sample groups
p_val         significance p values
q_val         p_val with BH adjustment for multiple testing
log2fc        the fold change of a gene set at logarithmic base of 2
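
As an illustration of the relation between p_val and q_val, the sketch below applies the Benjamini-Hochberg adjustment with p.adjust from base R. The p values are made up, and treating one vector as a single family of tests is an assumption; the table does not state how proteoQ groups the tests.

```r
# Made-up p values for four gene sets under one contrast
p_val <- c(0.01, 0.04, 0.03, 0.20)

# BH adjustment; the grouping of tests into one family is an assumption
q_val <- p.adjust(p_val, method = "BH")

print(round(q_val, 4))
# 0.0400 0.0533 0.0533 0.2000
```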

Protein_GSPA_[...]essmap.txt

Key       Description
term      a gene set term
ess_term  an essential gene set term
size      the number of IDs under a term with matches to an ess_term
ess_size  the number of essential IDs under a term with matches to an ess_term
fraction  a fraction of matches in IDs between a term and an ess_term
distance  1 - fraction
idx       a numeric index of term
ess_idx   a numeric index of ess_term
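
The relation distance = 1 - fraction can be worked through with a small base R sketch. The IDs and set names are hypothetical, and the denominator of fraction is an assumption: the sketch divides the number of shared IDs by the number of IDs under the term, which is consistent with a term being a subset of an ess_term when the distance is zero, but may differ from the exact definition used by proteoQ.

```r
# Hypothetical ID sets (made up for illustration)
ess_set <- c("P1", "P2", "P3", "P4", "P5")
sets <- list(
  nested    = c("P1", "P2", "P3"),        # every ID is in ess_set
  partial   = c("P1", "P2", "P8", "P9"),  # half of the IDs are in ess_set
  unrelated = c("P6", "P7")               # no ID is in ess_set
)

# ASSUMPTION: fraction = shared IDs / IDs under the term
fraction <- vapply(
  sets,
  function(ids) length(intersect(ids, ess_set)) / length(ids),
  numeric(1)
)
distance <- 1 - fraction

print(data.frame(term = names(sets), fraction, distance, row.names = NULL))
```

Under this assumption the nested set has distance 0, the half-overlapping set 0.5, and the unrelated set 1.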

See Also

Metadata
load_expts for metadata preparation and a reduced working example in data normalization

Data normalization
normPSM for extended examples in PSM data normalization
PSM2Pep for extended examples in PSM to peptide summarization
mergePep for extended examples in peptide data merging
standPep for extended examples in peptide data normalization
Pep2Prn for extended examples in peptide to protein summarization
standPrn for extended examples in protein data normalization
purgePSM and purgePep for extended examples in data purging
pepHist and prnHist for extended examples in histogram visualization
extract_raws and extract_psm_raws for extracting MS file names

Variable arguments of 'filter_...'
contain_str, contain_chars_in, not_contain_str, not_contain_chars_in, start_with_str, end_with_str, start_with_chars_in and ends_with_chars_in for data subsetting by character strings

Missing values
pepImp and prnImp for missing value imputation

Informatics
pepSig and prnSig for significance tests
pepVol and prnVol for volcano plot visualization
prnGSPA for gene set enrichment analysis by protein significance pVals
gspaMap for mapping GSPA to volcano plot visualization
prnGSPAHM for heat map and network visualization of GSPA results
prnGSVA for gene set variance analysis
prnGSEA for data preparation for online GSEA
pepMDS and prnMDS for MDS visualization
pepPCA and prnPCA for PCA visualization
pepLDA and prnLDA for LDA visualization
pepHM and prnHM for heat map visualization
pepCorr_logFC, prnCorr_logFC, pepCorr_logInt and prnCorr_logInt for correlation plots
anal_prnTrend and plot_prnTrend for trend analysis and visualization
anal_pepNMF, anal_prnNMF, plot_pepNMFCon, plot_prnNMFCon, plot_pepNMFCoef, plot_prnNMFCoef and plot_metaNMF for NMF analysis and visualization

Custom databases
Uni2Entrez for lookups between UniProt accessions and Entrez IDs
Ref2Entrez for lookups among RefSeq accessions, gene names and Entrez IDs
prepGO for gene ontology
prepMSig for molecular signatures
prepString and anal_prnString for STRING-DB

Column keys in PSM, peptide and protein outputs
system.file("extdata", "psm_keys.txt", package = "proteoQ")
system.file("extdata", "peptide_keys.txt", package = "proteoQ")
system.file("extdata", "protein_keys.txt", package = "proteoQ")

Examples


# ===================================
# Heat maps of GSPA
# ===================================

## !!! requires the brief working example in `?load_expts`

## global option
scale_log2r <- TRUE

## prerequisites in significance and enrichment tests
# (see also ?prnSig, ?prnGSPA)
pepSig(
  impute_na = FALSE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"], # batch effects
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"], # location effects
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = FALSE)

prnGSPA(
  pval_cutoff = 5E-2,
  logFC_cutoff = log2(1.2),
  gspval_cutoff = 5E-2,
  gset_nms = c("go_sets", "kegg_sets"),
  impute_na = FALSE,
)

# ===================================
# Distance heat maps of gene sets
# (also interactive networks)
# ===================================
# a `term` is a subset of an `ess_term` if the distance is zero
prnGSPAHM(
  filter2_by = exprs(distance <= .6),
  annot_cols = "ess_idx",
  annot_colnames = "Eset index",
  annot_rows = "ess_size",
  filename = show_some_redundancy.png,
)

# human terms only
prnGSPAHM(
  filter2_by_dist = exprs(distance <= .95),
  filter2_by_sp = exprs(start_with_str("hs", term)),
  annot_cols = "ess_idx",
  annot_colnames = "Eset index",
  filename = show_more_connectivity.png,
)

# custom color palette
prnGSPAHM(
  annot_cols = c("ess_idx", "ess_size"),
  annot_colnames = c("Eset index", "Size"),
  filter2_by = exprs(distance <= .95),
  color = colorRampPalette(c("blue", "white", "red"))(100),
  filename = custom_colors.png,
)


qzhang503/proteoQ documentation built on Dec. 14, 2024, 12:27 p.m.