filter_common_genes: Remove non-common genes from data frame

In tidyestimate: A Tidy Implementation of 'ESTIMATE'

filter_common_genes

R Documentation

Remove non-common genes from data frame

Description

Because ESTIMATE score calculation is sensitive to the number of genes used, a set of common genes shared across six platforms has been established (see ?tidyestimate::common_genes). This function keeps only those genes.
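The filtering idea can be illustrated with a minimal base R sketch. This is not the package's implementation: `reference_genes` is a hypothetical stand-in for `tidyestimate::common_genes`, and the gene symbols and expression values are made up.

```r
# Hypothetical reference set standing in for tidyestimate::common_genes.
reference_genes <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D")

# Hypothetical expression data with gene identifiers in the first column.
expr <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_C", "GENE_X"),
  sample_1 = c(5.1, 7.3, 2.8),
  stringsAsFactors = FALSE
)

# Keep only rows whose gene is in the reference set.
kept <- expr[expr$hgnc_symbol %in% reference_genes, ]

# Reference genes absent from the data, analogous to what
# tell_missing = TRUE is documented to report.
missing_from_data <- setdiff(reference_genes, expr$hgnc_symbol)
```

Here `GENE_X` is dropped because it is not in the reference set, and `missing_from_data` lists the reference genes the data lacks.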

Usage

filter_common_genes(
  df,
  id = c("entrezgene_id", "hgnc_symbol"),
  tidy = FALSE,
  tell_missing = TRUE,
  find_alias = FALSE
)

Arguments

`df`	a `data.frame` of RNA expression values, with columns corresponding to samples, and rows corresponding to genes. Either rownames or the first column can contain gene IDs (see `tidy`)
`id`	either `"entrezgene_id"` or `"hgnc_symbol"`, whichever `df` contains.
`tidy`	logical. If rownames contain the gene identifiers, set `FALSE`. If the first column contains the gene identifiers, set `TRUE`.
`tell_missing`	logical. If `TRUE`, prints a message listing genes in the common gene set that are not in the supplied data frame.
`find_alias`	logical. If `TRUE` and `id = "hgnc_symbol"`, will attempt to determine whether genes in `common_genes` that are missing from `df` appear there under an alias. See Details for more information.
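The two input layouts that `tidy` distinguishes can be sketched as follows. The data frames and their values are hypothetical and only illustrate the shapes; the calls follow the signature shown under Usage and run only if tidyestimate is installed. Whether these particular symbols are in the common gene set is not assumed.

```r
# Hypothetical expression values: three genes, two samples,
# gene identifiers stored as rownames (the tidy = FALSE layout).
expr_rownames <- data.frame(
  sample_1 = c(5.1, 7.3, 2.8),
  sample_2 = c(4.9, 6.8, 3.1),
  row.names = c("TP53", "EGFR", "MYC")
)

# Same data with identifiers moved into the first column
# (the tidy = TRUE layout).
expr_first_col <- data.frame(
  hgnc_symbol = rownames(expr_rownames),
  expr_rownames,
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Guarded so the sketch still runs without the package.
if (requireNamespace("tidyestimate", quietly = TRUE)) {
  from_rownames <- tidyestimate::filter_common_genes(
    expr_rownames, id = "hgnc_symbol", tidy = FALSE, tell_missing = FALSE
  )
  from_first_col <- tidyestimate::filter_common_genes(
    expr_first_col, id = "hgnc_symbol", tidy = TRUE, tell_missing = FALSE
  )
}
```

Both calls describe the same data; only the location of the gene identifiers differs.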

Details

The find_alias argument will attempt to find aliases for HGNC symbols that are in tidyestimate::common_genes but missing from the provided dataset. This will only run if find_alias = TRUE and id = "hgnc_symbol".

This algorithm is very conservative: it will only make a match if the gene from the common genes has exactly one alias that matches exactly one gene from the provided dataset, and that dataset gene in turn matches only a single gene from the list of common genes. (Note that a single gene may have many aliases.) Once a match has been made, the gene in the provided dataset is renamed to the gene name in the common gene list.
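The one-to-one rule described above can be sketched in base R. This is an illustration of the rule only, not the package's code: `alias_table`, `dataset_genes`, and all gene names are hypothetical.

```r
# Hypothetical alias table: one row per (common gene, alias) pair.
alias_table <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_C", "GENE_D",
                  "GENE_F", "GENE_F"),
  alias       = c("A_OLD",  "A_ALT",  "B_OLD",  "SHARED", "SHARED",
                  "F_OLD",  "F_ALT"),
  stringsAsFactors = FALSE
)

# Hypothetical gene symbols found in the user's dataset.
dataset_genes <- c("A_OLD", "B_OLD", "SHARED", "F_OLD", "F_ALT", "GENE_E")

# Common genes that the dataset lacks under their official symbol.
missing_common <- setdiff(unique(alias_table$hgnc_symbol), dataset_genes)

# Candidate pairs: an alias of a missing common gene that is in the dataset.
candidates <- alias_table[
  alias_table$hgnc_symbol %in% missing_common &
    alias_table$alias %in% dataset_genes,
]

# Count how many candidate pairs each gene takes part in, on both sides.
n_per_common  <- table(candidates$hgnc_symbol)
n_per_dataset <- table(candidates$alias)

# Keep only pairs that are unambiguous in both directions.
is_unique <- as.vector(n_per_common[candidates$hgnc_symbol]) == 1 &
  as.vector(n_per_dataset[candidates$alias]) == 1
one_to_one <- candidates[is_unique, ]

# Rename the matched dataset genes to the common gene symbol.
renamed <- dataset_genes
renamed[match(one_to_one$alias, renamed)] <- one_to_one$hgnc_symbol
```

In this toy input, `A_OLD` and `B_OLD` are renamed. `SHARED` is left alone because it is an alias of two common genes, and `GENE_F` is skipped because two of its aliases appear in the dataset.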

While this method is fairly accurate, it is also a heuristic. Therefore, it is disabled by default. Users should check which genes are being reassigned to ensure accuracy.

The method used to generate these aliases is described at ?tidyestimate::common_genes.

Value

A tibble with gene identifiers in the first column.

Examples

filter_common_genes(ov, id = "hgnc_symbol", tidy = FALSE, tell_missing = TRUE, find_alias = FALSE)

tidyestimate documentation built on Aug. 21, 2023, 9:08 a.m.
