unique_granges: Determine all unique rows in GRanges object


View source: R/unique_granges.R

Description

Given a GRanges object, this function returns all unique rows (observations), taking the metadata columns into account. Calling unique() on a GRanges object returns only the unique ranges and ignores metadata, unlike unique() on a data.frame. Setting sum.cols = TRUE (which sums the column 'counts'), or passing one or more column names to sum.cols, sums the numeric values in those columns across the rows that are combined.

Usage

unique_granges(sites)

unique_granges(sites, sum.cols = FALSE, rm.cols = NULL, rm.dup.cols = NULL)

Arguments

sites

A GRanges object with or without metadata columns.

sum.cols

A logical, or a character vector of metadata column name(s) in the input GRanges object, to sum across unique observations. These column(s) are not considered when identifying unique observations. Default is FALSE; if set to TRUE, the function attempts to sum the column 'counts'.

rm.cols

a character vector of column name(s) to remove from the metadata of the input GRanges object. Removal of columns will occur prior to identifying unique observations, and therefore these columns will not only be dropped from the output, but will not be considered in identifying unique observations.

rm.dup.cols

logical or a regular expression to identify duplicate columns and remove the duplicates. This will only remove columns that are true duplicates, meaning they have identical content. Removal of duplicated columns will occur prior to identifying unique observations, and therefore identified duplicates will not only be dropped from the output, but will not be considered in identifying unique observations.
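
The idea behind rm.dup.cols can be illustrated with a minimal base-R sketch on a plain data.frame standing in for the metadata columns. This is not the package's implementation: the object names (meta, pattern, candidates, drop_cols, meta_clean) and the pattern "^sample" are hypothetical and chosen only for illustration. It shows columns being selected by a regular expression and removed only when their content is identical to an earlier column.

```r
# Toy metadata table; 'sample.1' is an exact copy of 'sample'.
meta <- data.frame(
  sample   = c("A", "B", "A", "B"),
  sample.1 = c("A", "B", "A", "B"),
  counts   = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical pattern selecting which columns are candidates for removal.
pattern <- "^sample"
candidates <- grep(pattern, names(meta), value = TRUE)

# duplicated() on a list flags elements identical to an earlier element,
# so only later columns with the same content are marked.
is_dup <- duplicated(as.list(meta[candidates]))
drop_cols <- candidates[is_dup]

# Drop the flagged columns and keep everything else.
meta_clean <- meta[, setdiff(names(meta), drop_cols), drop = FALSE]
print(names(meta_clean))
```

Because the comparison is on content rather than on names, a column that merely matches the pattern but holds different values would be kept in this sketch.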

Details

unique_granges returns a GRanges object containing only unique observations (all duplicated rows are removed), taking the metadata columns into account when comparing rows.
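
The distinction between range-only and row-wise uniqueness can be sketched in base R with a data.frame standing in for a GRanges object. This is only an analogy under stated assumptions, not how unique_granges is implemented: the table sites_df and the helper names range_cols, by_range, by_row, key_cols and summed are hypothetical.

```r
# Toy table: the first four columns play the role of the range,
# 'sample' and 'counts' play the role of metadata columns.
sites_df <- data.frame(
  seqnames = rep("chr1", 6),
  start    = c(10, 10, 10, 10, 25, 25),
  end      = c(40, 40, 40, 40, 60, 60),
  strand   = rep("+", 6),
  sample   = c("A", "A", "B", "B", "A", "A"),
  counts   = c(1, 2, 1, 1, 3, 3),
  stringsAsFactors = FALSE
)

range_cols <- c("seqnames", "start", "end", "strand")

# Range-only uniqueness, analogous to unique() on a GRanges object:
# metadata is ignored, so only one row per range survives.
by_range <- sites_df[!duplicated(sites_df[, range_cols]), ]

# Row-wise uniqueness over ranges and metadata together.
by_row <- unique(sites_df)

# Summing 'counts': leave the column out of the comparison, then
# sum it within each group of otherwise identical rows.
key_cols <- setdiff(names(sites_df), "counts")
summed <- aggregate(sites_df["counts"], by = sites_df[key_cols], FUN = sum)

print(nrow(by_range))
print(nrow(by_row))
print(summed)
```

In this toy table the range-only comparison collapses rows from samples A and B into a single row, which is the loss of information the function is meant to avoid.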

Author(s)

Christopher L. Nobles, Ph.D.

Examples

gr <- gintools:::generate_test_granges(
  n_sites = 1,
  n_reads_p_site = 12,
  site_range = 1:20,
  read_width_range = 20:30
)
gr <- refine_breakpoints(gr)
gr <- standardize_sites(gr)
gr$sample <- rep(c("A","B"), 6)
gr$sample.1 <- rep(c("A","B"), 6)
gr$counts <- rep(1:4, c(3,3,3,3))

# Calling unique() on gr compares ranges only, which gives a misleading data set
unique(gr)

# Using unique_granges() without options returns all distinct rows
unique_granges(gr)

# With sum.cols = TRUE, values in the 'counts' column are summed when rows
# are combined.
unique_granges(gr, sum.cols = TRUE)

# Or multiple columns can be summed simultaneously.
gr$tags <- rep(1:2, c(6,6))
unique_granges(gr, sum.cols = c("counts", "tags"))

# Remove specific columns by passing a character vector to 'rm.cols'
unique_granges(gr, sum.cols = c("counts", "tags"), rm.cols = "sample.1")

# Remove any column that may be a duplicate of another with a specific 
# name pattern. Column content will be checked for identity before removal.
unique_granges(gr, sum.cols = c("counts", "tags"), rm.dup.cols = "[\\w]+")

cnobles/gintools documentation built on Aug. 22, 2019, 10:36 a.m.