semantic_copynumber_filter: Remove multicopy genes from up/down gene expression lists.

View source: R/de_shared.R

semantic_copynumber_filter    R Documentation

Remove multicopy genes from up/down gene expression lists.

Description

In our parasite data, a few gene types are consistently obnoxious: primarily multi-gene families whose coding sequences are divergent but whose UTRs are nearly identical. Our sequence-based removal methods fail for these genes, so this function simply excludes them by name.

Usage

semantic_copynumber_filter(
  input,
  max_copies = 2,
  use_files = FALSE,
  invert = TRUE,
  semantic = c("mucin", "sialidase", "RHS", "MASP", "DGF", "GP63"),
  semantic_column = "product"
)

Arguments

input

List of sets of genes deemed significantly up/down with a column expressing approximate count numbers.

max_copies

Keep only those genes with <= max_copies putative copies.

use_files

Use a set of sequence alignments to define the copy numbers?

invert

Keep these genes rather than drop them?

semantic

Set of strings with gene names to exclude.

semantic_column

Column in the DE table used to find the semantic strings for removal.
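
As an illustration of how semantic and semantic_column relate, the following is a minimal base R sketch of name-based filtering. It is not the hpgltools implementation: the table de_table, its contents, and the variable names pattern, is_multicopy, pruned, and dropped are all hypothetical, and case-sensitive regular-expression matching is an assumption.

```r
# Hypothetical DE table; gene IDs, products, and fold changes are invented.
de_table <- data.frame(
  product = c("mucin TcMUCII", "hypothetical protein",
              "trans-sialidase, putative", "ribosomal protein L3"),
  logFC = c(2.1, 1.7, -3.0, 1.2),
  row.names = c("gene1", "gene2", "gene3", "gene4"),
  stringsAsFactors = FALSE
)

# The documented defaults for the two arguments.
semantic <- c("mucin", "sialidase", "RHS", "MASP", "DGF", "GP63")
semantic_column <- "product"

# Assumption: join the strings into one alternation pattern and
# flag every row whose annotation matches any of them.
pattern <- paste(semantic, collapse = "|")
is_multicopy <- grepl(pattern, de_table[[semantic_column]])

# Rows without a match are retained; rows with a match are set aside.
pruned <- de_table[!is_multicopy, , drop = FALSE]
dropped <- de_table[is_multicopy, , drop = FALSE]
```

In this sketch, pruned holds the rows whose annotation matches none of the strings, and dropped holds the rest. Substring matching of this kind also catches longer names that contain a listed string, such as "trans-sialidase".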

Details

Currently untested. It is used primarily for trypanosome analyses, hence the default strings.

Value

Smaller list of up/down genes.

See Also

semantic_copynumber_extract()

Examples

## Not run: 
 ## Get rid of all genes with 'ribosomal' in the annotations.
 ## Here 'table' stands in for your own list of up/down gene tables.
 pruned <- semantic_copynumber_filter(table, semantic = "ribosomal")

## End(Not run)

elsayed-lab/hpgltools documentation built on May 9, 2024, 5:02 a.m.