HGNC.EnforceUnique: Enforce Unique HGNC Gene Symbols

View source: R/Seurat.Utils.R

HGNC.EnforceUnique    R Documentation

Enforce Unique HGNC Gene Symbols

Description

Ensures that gene symbols remain unique after they have been updated to HGNC symbols, by appending a suffix to duplicated symbols. Using make.unique is not ideal, since it can introduce mismatches, but in certain scenarios it significantly reduces the number of mismatching genes, which makes it a practical approach for data integration tasks.

Usage

HGNC.EnforceUnique(updatedSymbols)

Arguments

updatedSymbols

A data frame or matrix containing gene symbols updated via HGNChelper::checkGeneSymbols(). The third column should contain the updated gene symbols that are to be made unique.

Details

The function targets duplicate gene symbols, which can arise when gene symbols are updated to their latest HGNC-approved versions. Duplicate symbols introduce ambiguity into gene expression datasets and can affect downstream analyses such as differential expression or data integration. Making each gene symbol unique helps maintain the integrity of the dataset.

Value

A modified version of the input data frame or matrix with unique gene symbols in the third column. If duplicates were found, they are made unique by appending .1, .2, etc., to the repeated symbols.
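The suffixing described above can be illustrated with base R's make.unique(), which the Description names. This is only a sketch of the expected behavior, not the package's source: the toy data frame, its placeholder symbols, and the column names (assumed to mirror HGNChelper::checkGeneSymbols() output) are made up for illustration.

```r
# Toy stand-in for checkGeneSymbols() output; all symbols are placeholders.
# Column names are an assumption about the HGNChelper output layout.
SymUpd <- data.frame(
  x = c("OLD1", "GENEA", "OLD2", "GENEB"),
  Approved = c(FALSE, TRUE, FALSE, TRUE),
  Suggested.Symbol = c("GENEA", "GENEA", "GENEA", "GENEB"),
  stringsAsFactors = FALSE
)

# Assumed equivalent of the suffixing step: make the third column unique.
# The first occurrence is kept as-is; later repeats get .1, .2, ...
SymUpd[, 3] <- make.unique(SymUpd[, 3])
print(SymUpd[, 3])
# "GENEA" "GENEA.1" "GENEA.2" "GENEB"
```

Note that make.unique() leaves the first occurrence of each symbol unchanged, so only the repeats carry a suffix.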

Note

This function is a workaround for ensuring unique gene symbols and might not be suitable for all datasets or analyses. It's important to review the results and ensure that the gene symbols accurately represent your data.
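One way to review the results is to list which symbols received a suffix. The sketch below uses base R's make.unique() as a stand-in for the function's output, and the symbols and variable names (original, renamed, review) are hypothetical placeholders.

```r
# Placeholder symbols; in practice `original` would be the third column
# of the checkGeneSymbols() result before enforcing uniqueness.
original <- c("GENEA", "GENEA", "GENEB", "GENEA")

# Stand-in for the uniqueness step (assumption: same suffixing as make.unique).
renamed <- make.unique(original)

# Keep only the entries whose symbol was altered, for manual inspection.
changed <- renamed != original
review <- data.frame(
  original = original[changed],
  renamed = renamed[changed],
  stringsAsFactors = FALSE
)
print(review)
```

Each row of `review` pairs a duplicated symbol with its suffixed replacement, so the affected genes can be checked before downstream analysis.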

Examples

## Not run: 
if (interactive()) {
  # Assuming `SymUpd` is your data frame of updated symbols from HGNChelper::checkGeneSymbols()
  uniqueSymbols <- HGNC.EnforceUnique(updatedSymbols = SymUpd)
  # `uniqueSymbols` now contains unique gene symbols in its third column
}

## End(Not run)


vertesy/Seurat.utils documentation built on Dec. 4, 2024, 5:20 p.m.