bed_jaccard: Calculate the Jaccard statistic for two sets of intervals.


View source: R/bed_jaccard.r

Description

Quantifies the extent of overlap between two sets of intervals in terms of base pairs. Groups shared between the inputs are used to calculate the statistic for subsets of data.

Usage

bed_jaccard(x, y)

Arguments

x

tbl_interval()

y

tbl_interval()

Details

The Jaccard statistic takes values in the range [0, 1] and is defined as:

J(x, y) = \frac{\lvert x \cap y \rvert}{\lvert x \cup y \rvert} = \frac{\lvert x \cap y \rvert}{\lvert x \rvert + \lvert y \rvert - \lvert x \cap y \rvert}
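The formula can be checked by hand on a toy example. The sketch below uses base R only and is not how valr computes the statistic. It assumes BED-style half-open intervals on a single chromosome, and the coordinates and the helper `covered_bases()` are made up for illustration.

```r
# Toy intervals on one chromosome, BED-style half-open [start, end).
# Coordinates are invented for illustration.
x <- data.frame(start = c(100, 300), end = c(200, 400))
y <- data.frame(start = c(150, 350), end = c(250, 500))

# Hypothetical helper: expand an interval set into the base positions it covers.
# Only practical for small examples.
covered_bases <- function(df) {
  unique(unlist(Map(function(s, e) seq(s, e - 1), df$start, df$end)))
}

bx <- covered_bases(x)  # 200 bases
by <- covered_bases(y)  # 250 bases

n_intersect <- length(intersect(bx, by))  # |x intersect y|
n_union     <- length(union(bx, by))      # |x union y|

# Both forms of the formula give the same value
jaccard      <- n_intersect / n_union
jaccard_alt  <- n_intersect / (length(bx) + length(by) - n_intersect)
```

Here the two sets share 100 bases out of a 350-base union, so both expressions evaluate to 100/350, roughly 0.286.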

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.
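To make the per-group behavior concrete, here is a minimal base R sketch that computes one value per chromosome, using only chromosomes present in both inputs. It illustrates the idea rather than valr's implementation. The data and the helpers `covered_bases()` and `jaccard_one()` are hypothetical, and dropping unshared groups is an assumption based on the description above.

```r
# Toy intervals, BED-style half-open [start, end); values are invented.
x <- data.frame(chrom = c("chr1", "chr1", "chr2"),
                start = c(100, 300, 0),
                end   = c(200, 400, 50))
y <- data.frame(chrom = c("chr1", "chr2", "chr3"),
                start = c(150, 40, 0),
                end   = c(350, 90, 10))

# Hypothetical helper: base positions covered by a set of intervals
covered_bases <- function(df) {
  unique(unlist(Map(function(s, e) seq(s, e - 1), df$start, df$end)))
}

# Hypothetical helper: Jaccard statistic for one pair of interval sets
jaccard_one <- function(xg, yg) {
  bx <- covered_bases(xg)
  by <- covered_bases(yg)
  n_i <- length(intersect(bx, by))
  n_i / (length(bx) + length(by) - n_i)
}

# Only groups (chromosomes) found in both inputs are compared
shared <- intersect(unique(x$chrom), unique(y$chrom))

per_chrom <- vapply(shared, function(ch) {
  jaccard_one(x[x$chrom == ch, ], y[y$chrom == ch, ])
}, numeric(1))
```

`per_chrom` is a named numeric vector with one entry per shared chromosome. `chr3` appears only in `y`, so it yields no value.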

Value

tibble with the following columns:

If inputs are grouped, the return value will contain one set of values per group.

See Also

http://bedtools.readthedocs.org/en/latest/content/tools/jaccard.html

Other interval statistics: bed_absdist, bed_fisher, bed_projection, bed_reldist

Examples

library(valr)

# read chromosome sizes from the example file shipped with valr
genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))

x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_jaccard(x, y)

# calculate jaccard per chromosome
bed_jaccard(dplyr::group_by(x, chrom),
            dplyr::group_by(y, chrom))

rnabioco/valr documentation built on Jan. 6, 2019, 9:06 a.m.