anal_pepNMF: NMF Classification
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

anal_pepNMF

R Documentation

NMF Classification

Description

anal_pepNMF performs the NMF classification of peptide log2FC. The function is a wrapper of nmf.

anal_prnNMF performs the NMF classification of protein log2FC. The function is a wrapper of nmf.
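
As background on what a non-negative matrix factorization computes, the following is a minimal base-R sketch using the Lee-Seung multiplicative updates for the Frobenius loss. It is not the algorithm or code path used by `nmf` or by proteoQ. The helper `toy_nmf`, its arguments and the toy matrix `V` are assumptions made for illustration only.

```r
# Toy NMF: approximate a non-negative matrix V by W %*% H.
# `toy_nmf` is a hypothetical helper, not part of proteoQ or the NMF package.
toy_nmf <- function(V, rank, n_iter = 200, seed = 1, eps = 1e-9) {
  stopifnot(is.matrix(V), all(V >= 0))
  set.seed(seed)
  W <- matrix(runif(nrow(V) * rank), nrow(V), rank)  # basis matrix
  H <- matrix(runif(rank * ncol(V)), rank, ncol(V))  # coefficient matrix
  for (i in seq_len(n_iter)) {
    # Multiplicative updates keep every entry of W and H non-negative.
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H)
}

set.seed(42)
V <- matrix(runif(60), nrow = 10, ncol = 6)  # 10 features x 6 samples (made up)
fit <- toy_nmf(V, rank = 2)
dim(fit$W)  # 10 2
dim(fit$H)  # 2 6
```

Here `rank` plays the same role as the `rank` argument in Usage: it sets the number of columns of `W` and rows of `H`.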

Usage

anal_pepNMF(
  col_select = NULL,
  col_group = NULL,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = TRUE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  rank = NULL,
  nrun = if (length(rank) > 1) 50 else 1,
  seed = NULL,
  ...
)

anal_prnNMF(
  col_select = NULL,
  col_group = NULL,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = TRUE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  rank = NULL,
  nrun = if (length(rank) > 1) 50 else 1,
  seed = NULL,
  ...
)

Arguments

`col_select`	Character string to a column key in `expt_smry.xlsx`. At the `NULL` default, the column key of `Select` in `expt_smry.xlsx` will be used. In the case of no samples being specified under `Select`, the column key of `Sample_ID` will be used. The non-empty entries under the ascribing column will be used in the indicated analysis.
`col_group`	Character string to a column key in `expt_smry.xlsx`. Samples corresponding to non-empty entries under `col_group` will be used for sample grouping in the indicated analysis. At the `NULL` default, the column key `Group` will be used. No data annotation by groups will be performed if the fields under the indicated group column are empty.
`scale_log2r`	Logical; if TRUE, adjusts `log2FC` to the same scale of standard deviation across all samples. The default is TRUE. At `scale_log2r = NA`, the raw `log2FC` without normalization will be used.
`complete_cases`	Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.
`impute_na`	Logical; if TRUE, data with the imputation of missing values will be used. The default is TRUE.
`df`	The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an `id` among `c("pep_seq", "pep_seq_mod", "prot_acc", "gene")`. A primary file contains normalized peptide or protein data and is among `c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "Protein_impNA_pVal.txt")`. For analyses that require the fields of significance p-values, the `df` will be one of `c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "Protein_impNA_pVal.txt")`.
`filepath`	Use system default.
`filename`	A representative file name for outputs. By default, it will be determined automatically by the name of the current call.
`rank`	Numeric vector; the factorization rank(s) in `nmf`. At the `NULL` default, `c(4:8)` will be used.
`nrun`	Numeric; the number of runs in `nmf`. The default is 50 when more than one rank is supplied and 1 otherwise.
`seed`	Integer; a seed for reproducible analysis.
`...`	`filter_`: Logical expression(s) for the row filtration against data in a primary file linked to `df`. See also `normPSM` for the format of `filter_` statements. `arrange_`: Variable argument statements for the row ordering against data in a primary file linked to `df`. See also `prnHM` for the format of `arrange_` statements. Additional arguments for `nmf`.
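
The default for `nrun` in Usage is an expression that depends on `rank`. The sketch below copies only that expression into a stand-in function to show how it resolves. `show_nrun` is a hypothetical name and does not model anything else that `anal_pepNMF` or `anal_prnNMF` do with `rank`.

```r
# `show_nrun` is a hypothetical stand-in; only the default expression for
# `nrun` is taken from the Usage section.
show_nrun <- function(rank = NULL, nrun = if (length(rank) > 1) 50 else 1) {
  nrun
}

show_nrun(rank = 4)                   # 1: a single rank
show_nrun(rank = c(3:4))              # 50: several ranks to compare
show_nrun(rank = c(3:4), nrun = 20)   # 20: an explicit value overrides the default
```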

Details

The option of complete_cases will be forced to TRUE at impute_na = FALSE.
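
The rule above can be sketched in base R as follows. `resolve_complete_cases` and the toy matrix `x` are assumptions for illustration; the package's internal handling may differ.

```r
# `resolve_complete_cases` is a hypothetical helper mirroring the rule:
# without NA imputation, only complete cases are kept.
resolve_complete_cases <- function(complete_cases = FALSE, impute_na = TRUE) {
  if (!impute_na) TRUE else complete_cases
}

# Toy log2FC matrix (made-up values): rows are proteins, columns are samples.
x <- rbind(
  p1 = c(0.2, -0.1, 0.4),
  p2 = c(NA, 0.3, -0.2),
  p3 = c(-0.5, 0.1, 0.0)
)

if (resolve_complete_cases(complete_cases = FALSE, impute_na = FALSE)) {
  x <- x[complete.cases(x), , drop = FALSE]  # drop rows with any NA
}
rownames(x)  # "p1" "p3"
```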

Value

NMF classification of log2FC data.

Examples


# ===================================
# NMF
# ===================================

## !!!require the brief working example in `?load_expts`

## global option
scale_log2r <- TRUE

library(NMF)

# ===================================
# Analysis
# ===================================
## base (proteins)
anal_prnNMF(
  impute_na = FALSE,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
)

# passing a different `method`
anal_prnNMF(
  impute_na = FALSE,
  col_group = Group,
  method = "lee",
  rank = c(3:4),
  nrun = 20, 
  filename = lee.txt,
)

## row filtration and selected samples (proteins)
anal_prnNMF(
  impute_na = FALSE,
  col_select = BI,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
  filter_prots = exprs(prot_n_pep >= 3),
  filename = bi_npep3.txt,
)

## additional row filtration by pVals (proteins, impute_na = FALSE)
# if not yet, run prerequisite significance tests at `impute_na = FALSE`
pepSig(
  impute_na = FALSE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = FALSE)

# (`W16_vs_W2.pVal (W16-W2)` now a column key)
anal_prnNMF(
  impute_na = FALSE,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
  filter_prots_by_npep = exprs(prot_n_pep >= 3), 
  filter_prots_by_pval = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pval.txt,
)

## additional row filtration by pVals (impute_na = TRUE)
# if not yet, run prerequisite NA imputation and corresponding 
# significance tests at `impute_na = TRUE`
pepImp(m = 2, maxit = 2)
prnImp(m = 5, maxit = 5)

pepSig(
  impute_na = TRUE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = TRUE)

anal_prnNMF(
  impute_na = TRUE,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
  filter_prots_by_npep = exprs(prot_n_pep >= 3), 
  filter_prots_by_pval = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pval2.txt,
)

## analogous peptides
anal_pepNMF(
  impute_na = TRUE,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
  filter_prots_by_npep = exprs(prot_n_pep >= 3), 
  filter_prots_by_pval = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
)

anal_pepNMF(
  impute_na = FALSE,
  col_group = Group,
  rank = c(3:4),
  nrun = 20, 
  filter_prots_by_npep = exprs(prot_n_pep >= 3), 
  filter_prots_by_pval = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
)


# ===================================
# consensus heat maps
# ===================================
## no NA imputation 
# proteins, all available ranks
library(NMF)

plot_prnNMFCon(
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 14,
  height = 14,
)

# analogous peptides
plot_pepNMFCon(
  impute_na = FALSE,
  col_select = BI,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  color = colorRampPalette(RColorBrewer::brewer.pal(n = 7, name = "Spectral"))(50), 
  width = 10,
  height = 10,
  filename = bi.pdf,
)

# manual selection of input data file(s)
# may be used for optimizing individual plots
plot_prnNMFCon(
  df2 = c("Protein_NMF_Z_rank3_consensus.txt", "Protein_NMF_Z_rank4_consensus.txt"),
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 14,
  height = 14,
)

## NA imputation 
# proteins, all available ranks
plot_prnNMFCon(
  impute_na = TRUE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 14,
  height = 14,
)

# analogous peptides
plot_pepNMFCon(
  impute_na = TRUE,
  col_select = BI,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 10,
  height = 10,
  filename = bi_con.png,
)


# ===================================
# coefficient heat maps
# ===================================
## no NA imputation 
# proteins, all available ranks
plot_prnNMFCoef(
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 12,
  height = 12,
)

# manual selection of input data file(s)
# may be used for optimizing individual plots
plot_prnNMFCoef(
  df2 = c("Protein_NMF_Z_rank3_coef.txt"),  
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 12,
  height = 12,
)

# analogous peptides
plot_pepNMFCoef(
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  color = colorRampPalette(RColorBrewer::brewer.pal(n = 7, name = "Spectral"))(50), 
  width = 12,
  height = 12,
)

## NA imputation 
# proteins, all available ranks
plot_prnNMFCoef(
  impute_na = TRUE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 10,
  height = 10,
)

# analogous peptides
plot_pepNMFCoef(
  impute_na = TRUE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  width = 10,
  height = 10,
)


# ===================================
# Metagene heat maps
# ===================================
## no NA imputation 
# proteins, all available ranks
plot_metaNMF(
  impute_na = FALSE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  
  # additional arguments for `pheatmap`
  fontsize = 8,
  fontsize_col = 5,
)

# proteins, selected sample(s)
plot_metaNMF(
  impute_na = FALSE,
  col_select = BI_1,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  fontsize = 8,
  fontsize_col = 5,
  cellwidth = 6, 
  filename = bi1.png,
)

# proteins, selected sample(s) and row ordering
plot_metaNMF(
  impute_na = FALSE,
  col_select = BI_1,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  fontsize = 8,
  fontsize_col = 5,
  cellwidth = 6, 
  cluster_rows = FALSE,
  arrange_prots_by = exprs(gene),
  filename = bi1_row_by_genes.png,
)

# manual selection of input .rda file(s)
# may be used for optimizing individual plots
plot_metaNMF(
  df2 = c("Protein_NMF_Z_rank3.rda"),  
  impute_na = FALSE,
  col_select = BI_1,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  fontsize = 8,
  fontsize_col = 5,
  cellwidth = 6, 
  cluster_rows = FALSE,
  arrange_prots_by = exprs(gene),
  filename = bi1_row_by_genes.png,
)

## NA imputation 
# proteins, all available ranks
plot_metaNMF(
  impute_na = TRUE,
  annot_cols = c("Color", "Alpha", "Shape"),
  annot_colnames = c("Lab", "Batch", "WHIM"),
  fontsize = 8,
  fontsize_col = 5,
)
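
For readers new to the consensus heat maps plotted above: in the usual NMF sense, a consensus matrix is the average, over runs, of per-run connectivity matrices that record whether two samples fall in the same cluster. The base-R sketch below illustrates that idea on made-up cluster labels. It is an assumption that the `*_consensus.txt` files follow this definition, and `connectivity` is a hypothetical helper.

```r
# Hypothetical cluster labels for 5 samples over 3 NMF runs (made-up values).
runs <- list(
  c(1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2),
  c(2, 2, 1, 1, 1)
)

# Connectivity matrix of one run: 1 if two samples share a cluster, else 0.
connectivity <- function(labels) outer(labels, labels, "==") * 1

# Consensus matrix: element-wise mean of the connectivity matrices.
consensus <- Reduce(`+`, lapply(runs, connectivity)) / length(runs)
round(consensus, 2)
```

Entries near 1 mark sample pairs that cluster together in most runs, which is why a larger `nrun` gives a more stable picture when comparing ranks.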

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
