summarize_numerical | R Documentation |
Given a GRanges of annotated regions, summarize numerical data columns based on a grouping.
summarize_numerical(
annotated_regions,
by = c("annot.type", "annot.id"),
over,
quiet = FALSE
)
annotated_regions | The GRanges of annotated regions. |
by | A character vector of the columns of annotated_regions by which to group. |
over | A character vector of the numerical columns in annotated_regions to summarize. |
quiet | Print progress messages (FALSE) or not (TRUE). |
NOTE: We do not take the distinct values of seqnames, start, end, and annot.type as in the other summarize_*() functions. If a region intersects two distinct exons, using distinct() would destroy the information of the mean of the numerical column over one of the exons, which is not desirable.
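The effect described in the note can be illustrated with a small toy table. This is only a sketch: the data frame, its values, and the object names (toy, deduped, per_exon) are made up for illustration, and the dplyr calls are an assumption about how such a deduplication would look, not the package's actual code.

```r
library(dplyr)

# Hypothetical annotated regions: region chr1:100-200 overlaps two exons.
toy <- data.frame(
  seqnames   = c("chr1", "chr1", "chr1"),
  start      = c(100L, 100L, 500L),
  end        = c(200L, 200L, 600L),
  annot.type = c("exon", "exon", "exon"),
  annot.id   = c("exon:1", "exon:2", "exon:3"),
  coverage   = c(10, 10, 30),
  stringsAsFactors = FALSE
)

# Deduplicating on coordinates and annot.type keeps only the first row
# for chr1:100-200, so the row for exon:2 is lost.
deduped <- distinct(toy, seqnames, start, end, annot.type, .keep_all = TRUE)

# Grouping the full table keeps one summary row per exon.
per_exon <- toy %>%
  group_by(annot.type, annot.id) %>%
  summarize(n = n(), mean = mean(coverage), .groups = "drop")
```

In this toy case deduped no longer contains exon:2, while per_exon still reports a mean for each of the three exons.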
A grouped dplyr::tbl_df with the count, mean, and sd of the over columns, computed for each group defined by the by columns.
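As a rough picture of what such a grouped summary looks like, the following sketch computes a count, mean, and sd per group with dplyr on a made-up table. The toy data and the output column names (n, mean, sd) are assumptions for illustration; this is not necessarily how summarize_numerical() is implemented or how it names its columns.

```r
library(dplyr)

# Hypothetical annotated regions flattened to a data frame.
toy <- data.frame(
  annot.type = c("cpg_islands", "cpg_islands", "cpg_shores"),
  annot.id   = c("island:1", "island:1", "shore:1"),
  coverage   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Group by the 'by' columns, then summarize the 'over' column.
# .groups = "drop_last" leaves the result grouped by annot.type.
summary_tbl <- toy %>%
  group_by(annot.type, annot.id) %>%
  summarize(
    n    = n(),
    mean = mean(coverage),
    sd   = sd(coverage),
    .groups = "drop_last"
  )
```

Note that a group with a single row has an sd of NA, since the standard deviation is undefined for one observation.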
### Test on a very simple bed file to demonstrate different options
library(annotatr)

# Get premade CpG annotations
data('annotations', package = 'annotatr')

# Read the test regions, renaming the score column to 'coverage'
r_file = system.file('extdata', 'test_read_multiple_data_nohead.bed', package='annotatr')
extraCols = c(pval = 'numeric', mu1 = 'integer', mu0 = 'integer', diff_exp = 'character')
r = read_regions(con = r_file, genome = 'hg19', extraCols = extraCols, rename_score = 'coverage')

# Annotate the regions
a = annotate_regions(
    regions = r,
    annotations = annotations,
    ignore.strand = TRUE)

# Testing over normal by
sn1 = summarize_numerical(
    annotated_regions = a,
    by = c('annot.type', 'annot.id'),
    over = c('coverage', 'mu1', 'mu0'),
    quiet = FALSE)

# Testing over a different by
sn2 = summarize_numerical(
    annotated_regions = a,
    by = c('diff_exp'),
    over = c('coverage', 'mu1', 'mu0'))