anal_prnString: STRING outputs of protein-protein interactions

anal_prnString    R Documentation

STRING outputs of protein-protein interactions

Description

anal_prnString prepares STRING protein-protein interaction (ppi) data together with the companion protein expression data.

Usage

anal_prnString(
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  db_nms = NULL,
  score_cutoff = 0.7,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  ...
)

Arguments

scale_log2r

Not currently used. Values before and after scaling will both be reported.

complete_cases

Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.

impute_na

Logical; if TRUE, data with the imputation of missing values will be used. The default is FALSE.

db_nms

Character string(s); the file name(s) of the STRING database(s), each prefixed with its directory path. The database(s) need to match those generated by prepString. There is no default; users need to provide the correct file path(s) and name(s).

score_cutoff

Numeric; the threshold on the combined_score of protein-protein interactions. The default is 0.7.

df

The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an id among c("pep_seq", "pep_seq_mod", "prot_acc", "gene"). A primary file contains normalized peptide or protein data and is among c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt"). For analyses that require significance p-values, the df will be one of c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt").

filepath

Use system default.

filename

Use system default. Otherwise, the user-provided basename will be appended with _ppi.tsv for network data and _expr.tsv for expression data.

...

filter_: Variable argument statements for filtering rows of the data in the primary file linked to df. See also normPSM for the format of filter_ statements.

arrange_: Variable argument statements for ordering rows of the data in the primary file linked to df. See also prnHM for the format of arrange_ statements.
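
As an illustration of what a score_cutoff threshold does, the following is a minimal base-R sketch on a toy table. It is not the package's internal code: the column names node1 and node2, the example values, and the use of `>=` (rather than `>`) are assumptions. It also assumes that combined_score has been rescaled to the 0-1 range, consistent with the default of 0.7; STRING download files typically store scores as integers up to 1000.

```r
# Toy interaction table; `node1`, `node2` and all values are made up.
ppi <- data.frame(
  node1 = c("TP53", "TP53", "EGFR", "MYC"),
  node2 = c("MDM2", "EGFR", "GRB2", "MAX"),
  combined_score = c(0.999, 0.65, 0.95, 0.70),
  stringsAsFactors = FALSE
)

score_cutoff <- 0.7

# Keep interactions at or above the threshold
# (whether the package uses `>=` or `>` is an assumption here).
ppi_kept <- ppi[ppi$combined_score >= score_cutoff, ]
print(ppi_kept)
```

Raising the cutoff, for example to .9 as in the Examples below, keeps only the higher-confidence interactions.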

Details

Both output files, the ppi file [...]_ppi.tsv and the expression file [...]_expr.tsv, are compatible with Cytoscape.
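
The naming of the two output files can be sketched in base R as follows. The helper string_outnames and the folder out_dir are hypothetical and not part of proteoQ; the sketch assumes that the extension of a user-supplied filename is dropped before the _ppi.tsv and _expr.tsv suffixes are added.

```r
# Hypothetical helper (not a proteoQ function): derive the two output
# names from a user-supplied file name such as "human.tsv".
string_outnames <- function(filename) {
  base <- tools::file_path_sans_ext(basename(filename))
  c(ppi = paste0(base, "_ppi.tsv"), expr = paste0(base, "_expr.tsv"))
}

out <- string_outnames("human.tsv")
print(out)

# Read back whichever files are present as tab-delimited tables;
# `out_dir` is a placeholder for the actual output folder.
out_dir <- "."
paths <- file.path(out_dir, out)
tabs <- lapply(paths[file.exists(paths)], read.delim, check.names = FALSE)
```

Reading the files back into R this way is only a quick check of their contents before importing them into Cytoscape.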

See Also

prepString for database downloads and preparation.

Examples


# ===================================
# String DB
# ===================================

## !!!require the brief working example in `?load_expts`

library(proteoQ)

# `human` and `mouse` STRING using default urls;
prepString(human)
prepString(mouse)

# custom `human` and `mouse` STRING
prepString(
  species = does_not_matter_at_custom_urls,
  links_url = "https://stringdb-static.org/download/protein.links.full.v11.0/9606.protein.links.full.v11.0.txt.gz",
  aliases_url = "https://stringdb-static.org/download/protein.aliases.v11.0/9606.protein.aliases.v11.0.txt.gz",
  info_url = "https://stringdb-static.org/download/protein.info.v11.0/9606.protein.info.v11.0.txt.gz", 
  filename = my_hs.rds,
)

prepString(
  # species = this_mouse,
  links_url = "https://stringdb-static.org/download/protein.links.full.v11.0/10090.protein.links.full.v11.0.txt.gz",
  aliases_url = "https://stringdb-static.org/download/protein.aliases.v11.0/10090.protein.aliases.v11.0.txt.gz",
  info_url = "https://stringdb-static.org/download/protein.info.v11.0/10090.protein.info.v11.0.txt.gz", 
  filename = my_mm.rds,
)


## Not run: 
identical(
  readRDS(file.path("~/proteoQ/dbs/string/string_hs.rds")), 
  readRDS(file.path("~/proteoQ/dbs/string/my_hs.rds"))
)

## End(Not run)


# analysis: both `human` and `mouse`
anal_prnString(
  db_nms = c("~/proteoQ/dbs/string/string_hs.rds",
             "~/proteoQ/dbs/string/string_mm.rds"),
  score_cutoff = .9,
  filter_prots_by = exprs(prot_n_pep >= 2),
)


# `human` only ('unknown' species will be removed)
# OK to include both `string_hs.rds` and `string_mm.rds`
anal_prnString(
  db_nms = c("~/proteoQ/dbs/string/string_hs.rds",
             "~/proteoQ/dbs/string/string_mm.rds"),
  score_cutoff = .9,
  filter_by_sp = exprs(species == "human"),
  filter_prots_by = exprs(prot_n_pep >= 2),
  filename = human.tsv,
)

# `mouse` only
anal_prnString(
  db_nms = c("~/proteoQ/dbs/string/string_hs.rds",
             "~/proteoQ/dbs/string/string_mm.rds"),
  score_cutoff = .9,
  filter_by_sp = exprs(species == "mouse"),
  filter_prots_by = exprs(prot_n_pep >= 2),
  filename = mouse.tsv,
)


# additional filtration by `pVals` and `log2FC`; 
# `W16_vs_W2.pVal (W16-W2)` is a column key in `Protein_pVal.txt`
anal_prnString(
  db_nms = "~/proteoQ/dbs/string/string_hs.rds",
  score_cutoff = .9,
  filter_by_sp = exprs(species == "human", 
                       `W16_vs_W2.pVal (W16-W2)` <= 1E-6,
                       abs(`W16_vs_W2.log2Ratio (W16-W2)`) >= 1.2),
  filter_prots_by = exprs(prot_n_pep >= 2),
  filename = human_sigs.tsv,
)

anal_prnString(
  db_nms = "~/proteoQ/dbs/string/string_mm.rds",
  score_cutoff = .9,
  filter_by_sp = exprs(species == "mouse", 
                       `W16_vs_W2.pVal (W16-W2)` <= 1E-6,
                       abs(`W16_vs_W2.log2Ratio (W16-W2)`) >= 1.2),
  filter_prots_by = exprs(prot_n_pep >= 2),
  filename = mouse_sigs.tsv,
)

# can incorporate `prepString` into `anal_prnString`
anal_prnString(
  db_nms = c(prepString(human),
             prepString(mouse)),
  score_cutoff = .9,
  filter_prots_by = exprs(prot_n_pep >= 2),
  filename = one_pot.tsv,
)


qzhang503/proteoQ documentation built on Sept. 5, 2024, 3:24 p.m.