stringTest: String analysis

stringTest    R Documentation

String analysis

Description

The input df contains pVal fields.

Usage

stringTest(
  df = NULL,
  id = gene,
  label_scheme_sub = NULL,
  db_nms = NULL,
  score_cutoff = 0.7,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  filepath = NULL,
  filename = NULL,
  ...
)

Arguments

df

The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an id among c("pep_seq", "pep_seq_mod", "prot_acc", "gene"). A primary file contains normalized peptide or protein data and is among c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "Protein_impNA_pVal.txt"). For analyses that require significance p-value fields, the df will be one of c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "Protein_impNA_pVal.txt").

id

Character string; one of pep_seq, pep_seq_mod, prot_acc and gene.

label_scheme_sub

A data frame containing the subset of entries from label_scheme for the selected samples.

db_nms

Character string(s) giving the name(s) of the STRING database file(s), each with its directory path prepended. The STRING database(s) need to match those generated by prepString. There is no default; users need to provide the correct file path(s) and name(s).

score_cutoff

Numeric; the threshold on the combined_score of protein-protein interactions. The default is 0.7.
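As an illustration of what a combined_score threshold does, the following base R sketch filters a small, made-up interaction table. It is not the package's implementation: the table, the node1 and node2 column names, the 0-1 score scale and the inclusive comparison are all assumptions made for this example.

```r
# Hypothetical interaction table; only the column name combined_score
# comes from the documentation above, everything else is made up.
ppi <- data.frame(
  node1 = c("A", "A", "B", "C"),
  node2 = c("B", "C", "C", "D"),
  combined_score = c(0.95, 0.40, 0.70, 0.69),
  stringsAsFactors = FALSE
)

score_cutoff <- 0.7

# Assumption: scores are on a 0-1 scale and the cutoff is inclusive.
ppi_kept <- ppi[ppi$combined_score >= score_cutoff, , drop = FALSE]
print(ppi_kept)
```

Raising score_cutoff keeps fewer, higher-confidence interactions; lowering it keeps more.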

scale_log2r

Logical; if TRUE, adjusts log2FC to the same scale of standard deviation across all samples. The default is TRUE. At scale_log2r = NA, the raw log2FC without normalization will be used.

complete_cases

Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.

filepath

A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of id in the call.

filename

A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is .txt. Image files are typically saved via ggsave or pheatmap, where the image type is determined by the extension of the file name.

...

filter_: Variable argument statements for the row filtration of data against the column keys in Peptide.txt for peptides or Protein.txt for proteins. Each statement contains a list of logical expression(s). The lhs needs to start with filter_. The logical condition(s) at the rhs need to be enclosed in exprs with round parentheses.

For example, pep_len is a column key in Peptide.txt. The statement filter_peps_at = exprs(pep_len <= 50) will remove peptide entries with pep_len > 50. See also normPSM.
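The effect of such a statement can be sketched outside the package with rlang and dplyr. The toy peptides table below is hypothetical, and splicing the expressions into dplyr::filter is only one plausible way to apply them; how the package evaluates filter_ statements internally is not shown here.

```r
# Toy stand-in for Peptide.txt; the rows are made up for illustration.
peptides <- data.frame(
  pep_seq = c("PEPTIDEA", "PEPTIDEB", "PEPTIDEC"),
  pep_len = c(12, 63, 50),
  stringsAsFactors = FALSE
)

# Same form as the statement in the text: lhs starts with filter_,
# rhs is a list of logical expressions wrapped in exprs().
filter_peps_at <- rlang::exprs(pep_len <= 50)

# Assumption: each expression is applied as a row filter on the table.
kept <- dplyr::filter(peptides, !!!filter_peps_at)
print(kept)
```

Rows with pep_len > 50 are dropped, so only the entries of length 12 and 50 remain.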

Additional parameters for plotting with ggplot2:
xmin, the minimum x at a log2 scale; the default is -2.
xmax, the maximum x at a log2 scale; the default is +2.
xbreaks, the breaks in x-axis at a log2 scale; the default is 1.
binwidth, the binwidth of log2FC; the default is (xmax - xmin)/80.
ncol, the number of columns; the default is 1.
width, the width of plot.
height, the height of plot.
scales, should the scales be fixed across panels; the default is "fixed" and the alternative is "free".
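For the stated defaults, the derived binwidth can be worked out directly. The short R sketch below only reproduces the arithmetic; reading xbreaks as the spacing between axis breaks is an assumption, not something the text above confirms.

```r
xmin <- -2
xmax <- 2
xbreaks <- 1

# Default binwidth as given in the text: (xmax - xmin)/80.
binwidth <- (xmax - xmin) / 80

# Assumption: xbreaks is the step between consecutive x-axis breaks.
breaks <- seq(xmin, xmax, by = xbreaks)

print(binwidth)
print(breaks)
```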

Details

The argument scale_log2r is not used, since both the '_N' and '_Z' columns from the primary df are kept. The argument species is used for the generation of separate outputs by species.


qzhang503/proteoQ documentation built on Dec. 14, 2024, 12:27 p.m.