assignToGenes: Assign binding sites to their hosting genes

View source: R/workflow.R

assignToGenes    R Documentation

Assign binding sites to their hosting genes

Description

Function that assigns each binding site in the BSFDataSet to its hosting gene given a gene annotation (anno.annoDB, anno.genes).

Usage

assignToGenes(
  object,
  overlaps = c("frequency", "hierarchy", "remove", "keep"),
  overlaps.rule = NULL,
  anno.annoDB = NULL,
  anno.genes = NULL,
  match.geneID = "gene_id",
  match.geneName = "gene_name",
  match.geneType = "gene_type",
  quiet = FALSE
)

Arguments

object

a BSFDataSet object with stored binding sites, i.e. the ranges should have a width > 1

overlaps

character; how overlapping gene loci should be handled. One of 'frequency', 'hierarchy', 'remove' or 'keep'.

overlaps.rule

character vector; a vector of gene types that is used to resolve overlapping cases in a hierarchical manner. The order of the vector is the order of the hierarchy.

anno.annoDB

an object of class OrganismDbi that contains the gene annotation (experimental).

anno.genes

an object of class GRanges (GenomicRanges package) that represents the gene ranges directly

match.geneID

character; meta column name of the gene ID

match.geneName

character; meta column name of the gene name

match.geneType

character; meta column name of the gene type

quiet

logical; whether to suppress messages (with the default FALSE, messages are printed)

Details

Regardless of the annotation source that is used, the respective meta information about the genes has to be present. The names of the meta columns that hold this information are set by the match.geneID, match.geneName and match.geneType arguments.
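As a rough illustration of the meta column requirement, the following base R sketch checks whether the column names one intends to pass are present in an annotation. The data.frame gene_meta is a hypothetical stand-in for the meta columns of a gene annotation, and the column name biotype is made up for this example; none of this is code from the package.

```r
# Hypothetical stand-in for the meta columns of a gene annotation
# (toy values, not shipped with the package).
gene_meta <- data.frame(
  gene_id   = c("ENSG01", "ENSG02"),
  gene_name = c("GeneA", "GeneB"),
  biotype   = c("protein_coding", "lncRNA"),
  stringsAsFactors = FALSE
)

# Column names as they would be passed to the three match.* arguments.
wanted <- c(
  match.geneID   = "gene_id",
  match.geneName = "gene_name",
  match.geneType = "biotype"
)

# Requested names that are absent from the annotation.
missing_cols <- wanted[!wanted %in% colnames(gene_meta)]

if (length(missing_cols) > 0) {
  stop("Annotation lacks meta columns: ", paste(missing_cols, collapse = ", "))
}
```

With an annotation laid out like this toy example, the gene type column would be passed as match.geneType = "biotype" instead of relying on the default "gene_type".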

In the case of overlapping gene annotation, a single binding site will be associated with multiple genes. The overlaps parameter determines how these cases are resolved. Option 'frequency' takes the most frequently observed gene type; option 'hierarchy' works in conjunction with a user-defined rule (overlaps.rule). Options 'remove' and 'keep' remove or keep all overlapping cases, respectively.
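The four options can be sketched on a toy overlap table in base R. This is only an illustration of the idea, not the package's implementation: the table hits, the helper resolve_by_rule and the way frequencies are counted (over all site/gene overlaps in the table) are assumptions made for this example.

```r
# Toy overlap table (made-up data): one row per binding-site/gene overlap.
hits <- data.frame(
  site      = c("bs1", "bs2", "bs2", "bs3", "bs3", "bs4"),
  gene_id   = c("g1", "g2", "g3", "g4", "g5", "g6"),
  gene_type = c("protein_coding", "protein_coding", "lncRNA",
                "lncRNA", "miRNA", "protein_coding"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep one gene per site, preferring gene types that
# come earlier in `rule`. Types missing from `rule` are ranked last.
resolve_by_rule <- function(hits, rule) {
  type_rank <- match(hits$gene_type, rule)
  sorted <- hits[order(hits$site, type_rank, na.last = TRUE), ]
  sorted[!duplicated(sorted$site), ]
}

# 'hierarchy': the user supplies the order of gene types.
by_hierarchy <- resolve_by_rule(hits, c("lncRNA", "protein_coding", "miRNA"))

# 'frequency' (assumed counting scheme): order types by how often they occur.
freq_rule <- names(sort(table(hits$gene_type), decreasing = TRUE))
by_frequency <- resolve_by_rule(hits, freq_rule)

# 'remove': drop every site that overlaps more than one gene.
n_per_site <- table(hits$site)
only_unique <- hits[hits$site %in% names(n_per_site)[n_per_site == 1], ]

# 'keep': leave all overlaps in place.
all_kept <- hits
```

In this toy data, site bs2 is assigned to the lncRNA gene under the hierarchy rule but to the protein-coding gene under the frequency rule, since protein_coding is the most common type. In the function itself, the hierarchical variant corresponds to calling assignToGenes with overlaps = "hierarchy" and a character vector passed as overlaps.rule.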

Note that if an overlap exists but the gene types are identical, options 'frequency' and 'hierarchy' will select the gene that was seen first as the representative.
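The consequence of this tie-breaking is that the result can depend on the order of the genes in the annotation. A minimal base R sketch, using made-up data and a hypothetical helper pick (not a function of the package):

```r
# Two genes of the same type overlap one binding site (toy data).
candidates <- data.frame(
  gene_id   = c("gA", "gB"),
  gene_type = c("protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)
rule <- c("protein_coding", "lncRNA")

# Hypothetical helper: pick the gene whose type ranks highest in `rule`.
# which.min() returns the first minimum, so equal ranks fall back to the
# order in which the candidates were seen.
pick <- function(candidates, rule) {
  candidates$gene_id[which.min(match(candidates$gene_type, rule))]
}

first_order    <- pick(candidates, rule)          # "gA"
reversed_order <- pick(candidates[2:1, ], rule)   # "gB"
```

If such order dependence is undesirable, option 'remove' avoids it by discarding the ambiguous binding sites altogether.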

The function is part of the standard workflow performed by BSFind.

Value

an object of class BSFDataSet with binding sites having hosting gene information added to their meta columns.

See Also

BSFind, geneOverlapsPlot, targetGeneSpectrumPlot

Examples

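# Load the example CLIP data shipped with the package
files <- system.file("extdata", package = "BindingSiteFinder")
load(list.files(files, pattern = "\\.rda$", full.names = TRUE))
# Load the GRanges object with the gene annotation (used below as gns)
load(list.files(files, pattern = "\\.rds$", full.names = TRUE)[1])
# Define binding sites of width 9, then assign each site to its hosting gene
bds <- makeBindingSites(object = bds, bsSize = 9)
bds <- assignToGenes(bds, anno.genes = gns)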


ZarnackGroup/BindingSiteFinder documentation built on Nov. 24, 2024, 10:41 a.m.