In rcavalcante/annotatr: Annotation of Genomic Regions to Genomic Annotations

summarize_categorical

R Documentation

Summarize categorical data over groupings of annotated regions

Description

Given a `GRanges` of annotated regions, count the number of regions in each group formed by the categorical columns named in `by`.

Usage

summarize_categorical(
  annotated_regions,
  by = c("annot.type", "annot.id"),
  quiet = FALSE
)

Arguments

`annotated_regions`	The `GRanges` result of `annotate_regions()`.
`by`	A character vector of column names in `as.data.frame(annotated_regions)` to group by and tally over. Default is `c('annot.type', 'annot.id')`.
`quiet`	If `FALSE` (the default), print progress messages; if `TRUE`, suppress them.

Details

If a region is annotated to multiple annotations of the same `annot.type`, the region is counted only once for that type. For example, a region annotated to multiple exons counts once toward exons, but a region annotated to both an exon and an intron counts toward both.
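
The counting rule above can be illustrated with a small base R sketch. This is not the package's implementation: the `toy` data frame, its `region` column, and all of its values are made up for illustration, standing in for `as.data.frame(annotated_regions)`.

```r
# Toy stand-in for as.data.frame(annotated_regions); all values are invented.
toy <- data.frame(
  region     = c("r1", "r1", "r1", "r2"),
  annot.type = c("exons", "exons", "introns", "exons"),
  annot.id   = c("exon:1", "exon:2", "intron:1", "exon:3"),
  stringsAsFactors = FALSE
)

# Keep one row per (region, annot.type) pair, so that region r1, which
# overlaps two exons, contributes only once to the exon count.
once_per_type <- unique(toy[, c("region", "annot.type")])

# Count the remaining rows per annotation type.
counts <- table(once_per_type$annot.type)
print(counts)
```

Here `r1` overlaps two exons and one intron, so it adds one to the exon count and one to the intron count, while `r2` adds one more to the exon count.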

Value

A grouped `dplyr::tbl_df` giving the count for each grouping defined by the `by` vector.
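
As a rough picture of the shape of such a result, the sketch below tallies a made-up data frame over a `by` vector using base R only. It is an assumption-laden illustration: the real function returns a grouped dplyr table rather than a plain data frame, and the rows, the count column name `n`, and the choice of `diff_exp` as a grouping column are all chosen here for the example.

```r
# Invented rows, assumed to be already de-duplicated per region and type.
toy <- data.frame(
  annot.type = c("exons", "exons", "introns", "exons"),
  diff_exp   = c("up", "down", "up", "up"),
  stringsAsFactors = FALSE
)
by <- c("annot.type", "diff_exp")

# Give each row a weight of 1, then sum the weights within every observed
# combination of the `by` columns to get one count per grouping.
toy$n <- 1L
tally <- aggregate(toy["n"], by = toy[by], FUN = sum)
print(tally)
```

Each row of `tally` is one observed combination of the `by` columns together with its count, which is the general layout to expect from a grouped tally.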

Examples


   # Get premade CpG annotations
   data('annotations', package = 'annotatr')

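   # Locate the example BED file shipped with annotatr and read it;
   # extraCols gives the names and types of the extra columns
   r_file = system.file('extdata', 'test_read_multiple_data_nohead.bed', package='annotatr')
   extraCols = c(pval = 'numeric', mu1 = 'integer', mu0 = 'integer', diff_exp = 'character')
   r = read_regions(con = r_file, genome = 'hg19', extraCols = extraCols, rename_score = 'coverage')

   # Annotate the regions, ignoring strand
   a = annotate_regions(
       regions = r,
       annotations = annotations,
       ignore.strand = TRUE)

   # Count regions for each combination of annotation type and 'name'
   sc = summarize_categorical(
       annotated_regions = a,
       by = c('annot.type', 'name'),
       quiet = FALSE)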

rcavalcante/annotatr documentation built on Aug. 22, 2024, 7:33 a.m.