qc_proteome_coverage: Proteome coverage per sample and total

View source: R/qc_proteome_coverage.R

qc_proteome_coverage    R Documentation

Proteome coverage per sample and total

Description

Calculates the proteome coverage for each sample and for all samples combined. In other words, the fraction of detected proteins out of all proteins in the proteome is calculated.
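The calculation described above can be sketched in base R. This is only an illustration of the idea, not the protti implementation: the proteome size is hard-coded here as a hypothetical value, whereas the function presumably derives it from the organism_id and reviewed arguments. The names proteome_size, detected and coverage are assumptions made for this sketch.

```r
# Hypothetical proteome size; the real function does not take this as input.
proteome_size <- 4518

# Hypothetical detected protein IDs for three samples.
detected <- data.frame(
  sample = c(rep("A", 100), rep("B", 1000), rep("C", 1001)),
  protein_id = c(1:100, 1:1000, 1000:2000)
)

# Number of distinct proteins detected in each sample.
n_per_sample <- tapply(
  detected$protein_id,
  detected$sample,
  function(x) length(unique(x))
)

# Number of distinct proteins detected across all samples combined.
n_total <- length(unique(detected$protein_id))

# Coverage as a percentage of the proteome, per sample and in total.
coverage <- data.frame(
  sample = c(names(n_per_sample), "Total"),
  detected_pct = c(as.vector(n_per_sample), n_total) / proteome_size * 100
)
coverage$undetected_pct <- 100 - coverage$detected_pct

coverage
```

Note that the total is based on the union of proteins over all samples, so it is not the sum of the per-sample values when samples share proteins.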

Usage

qc_proteome_coverage(
  data,
  sample,
  protein_id,
  organism_id,
  reviewed = TRUE,
  plot = TRUE,
  interactive = FALSE
)

Arguments

data

a data frame that contains at least sample names and protein IDs.

sample

a character column in the data data frame that contains the sample name.

protein_id

a character or numeric column in the data data frame that contains protein identifiers such as UniProt accessions.

organism_id

a numeric value that specifies an NCBI taxonomy identifier (TaxId) of the organism used. Human: 9606, S. cerevisiae: 559292, E. coli: 83333.

reviewed

a logical value that determines if only reviewed protein entries will be considered as the full proteome. Default is TRUE.

plot

a logical value that specifies whether the result should be plotted (default is TRUE).

interactive

a logical value that indicates whether the plot should be interactive (default is FALSE).

Value

A bar plot showing the percentage of the proteome detected and undetected in total and for each sample. If plot = FALSE, a data frame containing the numbers is returned.

Examples


# Create example data
proteome <- data.frame(id = 1:4518)
data <- data.frame(
  sample = c(rep("A", 100), rep("B", 1000), rep("C", 1001)),
  protein_id = c(proteome$id[1:100], proteome$id[1:1000], proteome$id[1000:2000])
)

# Calculate proteome coverage
qc_proteome_coverage(
  data = data,
  sample = sample,
  protein_id = protein_id,
  organism_id = 83333,
  plot = FALSE
)

# Plot proteome coverage
qc_proteome_coverage(
  data = data,
  sample = sample,
  protein_id = protein_id,
  organism_id = 83333,
  plot = TRUE
)


protti documentation built on Jan. 22, 2023, 1:11 a.m.