disambiguate_protein_table_by_gene: given some table with protein-level data, map to HGNC genes...

View source: R/export_stats_genesummary.R

disambiguate_protein_table_by_gene    R Documentation

Given a table with protein-level data, map to HGNC genes and deal with redundant or ambiguous gene-level results

Description

The order of the data table is pivotal: for each unique gene (uniqueness criteria are set by the parameter gene_ambiguity), the first row is selected!
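
The "first row wins" rule can be illustrated with a minimal base R sketch. It does not call disambiguate_protein_table_by_gene() and only mimics the selection rule on a toy table; the pvalue column and all values are made up for illustration.

```r
# Toy protein-level table; the `pvalue` column and all values are hypothetical.
protein_data <- data.frame(
  protein_id   = c("P1", "P2", "P3", "P4"),
  gene_symbols = c("GENEA", "GENEA", "GENEB", "GENEB"),
  pvalue       = c(0.20, 0.01, 0.03, 0.50),
  stringsAsFactors = FALSE
)

# Sort by p-value so the most significant protein of each gene comes first.
sorted <- protein_data[order(protein_data$pvalue), ]

# Keep only the first row per gene (an imitation of the selection rule,
# not the package's actual implementation).
first_per_gene <- sorted[!duplicated(sorted$gene_symbols), ]
```

Without the sorting step, the rows for P1 and P3 would be kept instead, simply because they appear first in the input.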

Usage

disambiguate_protein_table_by_gene(
  protein_data,
  hgnc,
  gene_ambiguity,
  xref = NULL,
  remove_nohgnc = FALSE,
  distinct_factors = NULL
)

Arguments

protein_data

a data.frame with the columns protein_id, gene_symbols and gene_symbols_or_id (from dataset$proteins). Importantly, if you opt to return only unique entries per gene, the row of the FIRST matching protein_id is returned, so sort your data table by p-value upstream!

hgnc

see export_stats_genesummary()

gene_ambiguity

see export_stats_genesummary()

xref

see export_stats_genesummary()

remove_nohgnc

see export_stats_genesummary()

distinct_factors

a set of columns in protein_data that, together with protein_id, should be treated as factors that describe subsets of the data; unique gene-level data is returned within each subset
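
As a rough illustration of what uniqueness "within subsets" could look like, the base R sketch below keeps one row per gene within each level of a hypothetical contrast column. This is one reading of distinct_factors, not the package's implementation; the contrast and pvalue columns and all values are assumptions.

```r
# Toy statistics table with two proteins of the same gene in two contrasts.
# The `contrast` and `pvalue` columns are hypothetical.
stats <- data.frame(
  protein_id   = c("P1", "P2", "P1", "P2"),
  gene_symbols = c("GENEA", "GENEA", "GENEA", "GENEA"),
  contrast     = c("A vs B", "A vs B", "A vs C", "A vs C"),
  pvalue       = c(0.20, 0.01, 0.02, 0.30),
  stringsAsFactors = FALSE
)

# Sort by p-value so the best row of each gene/contrast pair comes first.
sorted <- stats[order(stats$pvalue), ]

# Keep the first row per combination of gene and subset-defining column.
unique_within <- sorted[!duplicated(sorted[, c("gene_symbols", "contrast")]), ]
```

Here each contrast retains its own best row for GENEA (P2 in "A vs B", P1 in "A vs C"), instead of a single row for the gene across the whole table.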


ftwkoopmans/msdap documentation built on March 5, 2025, 12:15 a.m.