break_connecting_source_paths: Break graph paths which connect sources.

View source: R/break_connecting_source_paths.R

Description

Given a set of unique integration site positions (a reduced GRanges object) and a directed graph of connected components, this function identifies clusters with multiple sources and the paths between those sources, then removes edges along each path so that every cluster has exactly one source node. Edge removal is decided first by nucleotide distance (greater distance preferred), then by abundance (lowest abundance preferred), and finally by an upstream bias (when everything else ties, the downstream connection is removed).
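The three-level ordering described above can be illustrated with a small base R sketch. This is not the gintools implementation: the data frame, its column names (nt_dist, abundance, downstream), and the helper pick_edge_to_remove are hypothetical, and the behaviour shown for a "downstream" bias (removing the upstream connection on a tie) is an assumption made by mirroring the documented "upstream" case.

```r
# Hypothetical candidate edges on a path between two sources.
# Column names are illustrative and are not taken from gintools.
edges <- data.frame(
  from       = c(1L, 2L, 3L),
  to         = c(2L, 3L, 4L),
  nt_dist    = c(2L, 5L, 5L),          # nucleotide distance spanned by the edge
  abundance  = c(10L, 3L, 3L),         # abundance associated with the edge
  downstream = c(FALSE, FALSE, TRUE)   # is this the downstream connection?
)

# Rank candidates: greatest distance first, then lowest abundance,
# then the bias-based tie-break. With an "upstream" bias the downstream
# connection sorts first; the "downstream" case is assumed to mirror this.
pick_edge_to_remove <- function(edges, bias = "upstream") {
  bias <- match.arg(bias, c("upstream", "downstream"))
  tie_break <- if (bias == "upstream") !edges$downstream else edges$downstream
  ord <- order(-edges$nt_dist, edges$abundance, tie_break)
  edges[ord[1], , drop = FALSE]
}

removed_up   <- pick_edge_to_remove(edges, bias = "upstream")
removed_down <- pick_edge_to_remove(edges, bias = "downstream")
```

In this toy input, edges 2 and 3 tie on both distance and abundance, so only the bias decides between them.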

Usage

break_connecting_source_paths(red.sites, graph, bias)

Arguments

red.sites

GRanges object which has been reduced to single nt positions and contains the revmap from the original GRanges object. The object must also contain a column for cluster membership (clusID) and a column for abundance (fragLengths).

graph

A directed graph built from the red.sites object. Each node corresponds to a row in the red.sites object.

bias

Either "upstream" or "downstream", designating which position to choose if the other decision metrics are tied.

Details

break_connecting_source_paths returns a graph where only one source is present per cluster.
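One way to check the "one source per cluster" property on a returned graph is to count, within each weakly connected component, the vertices that have no incoming edges. The sketch below uses igraph on a toy graph. The helper count_sources is hypothetical and not part of gintools, and the edge removed here is chosen by hand for illustration rather than by the function's decision rules.

```r
library(igraph)

# Toy directed graph: vertices 1-3 form one cluster with two sources
# (1 and 3 both point at 2); vertices 4-5 form a cluster with one source.
g <- add_edges(
  make_empty_graph(n = 5, directed = TRUE),
  c(1, 2,  3, 2,  4, 5))

# Count vertices with in-degree zero in each weakly connected component.
count_sources <- function(graph) {
  membership <- components(graph, mode = "weak")$membership
  is_source <- degree(graph, mode = "in") == 0
  tapply(is_source, membership, sum)
}

before <- count_sources(g)

# Remove the second edge (3 -> 2) by its edge id. Vertex 3 then becomes
# its own single-vertex cluster, so every cluster has exactly one source.
g_fixed <- delete_edges(g, 2)
after <- count_sources(g_fixed)
```

Applying the same helper to the output of break_connecting_source_paths should, under these assumptions, yield a count of 1 for every cluster.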

Author(s)

Christopher Nobles, Ph.D.

Examples

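# Packages the example relies on (assumed to be installed).
library(gintools)
library(GenomicRanges)
library(igraph)
library(dplyr)

# Reduce the test ranges to unique single-nucleotide site positions,
# keeping the mapping back to the original ranges.
gr <- generate_test_granges(stdev = 3)
red.sites <- reduce(
  flank(gr, -1, start = TRUE),
  min.gapwidth = 0L,
  with.revmap = TRUE)
red.sites$siteID <- seq_len(length(red.sites))
revmap <- as.list(red.sites$revmap)
red.sites$abundance <- sapply(revmap, length)

# Pair neighbouring sites and keep the edge that points from the more
# abundant site to the less abundant one; ties go to the upstream site.
red.hits <- GenomicRanges::as.data.frame(
  findOverlaps(red.sites, maxgap = 0L, drop.self = TRUE))
red.hits <- red.hits %>%
  mutate(q_pos = start(red.sites[queryHits])) %>%
  mutate(s_pos = start(red.sites[subjectHits])) %>%
  mutate(q_abund = red.sites[queryHits]$abundance) %>%
  mutate(s_abund = red.sites[subjectHits]$abundance) %>%
  # Overlapping query and subject sites share a strand, so the strand of
  # the query is used for each row.
  mutate(strand = as.character(strand(red.sites[queryHits]))) %>%
  mutate(is.upstream = ifelse(
    strand == "+",
    q_pos < s_pos,
    q_pos > s_pos)) %>%
  mutate(keep = q_abund > s_abund) %>%
  mutate(keep = ifelse(
    q_abund == s_abund,
    is.upstream,
    keep)) %>%
  filter(keep)

# Build the directed graph (one node per row of red.sites) and record
# cluster membership.
g <- make_empty_graph(n = length(red.sites), directed = TRUE) %>%
  add_edges(unlist(mapply(
    c, red.hits$queryHits, red.hits$subjectHits, SIMPLIFY = FALSE)))
red.sites$clusID <- clusters(g)$membership

# Connect satellite vertices, refresh cluster membership, then break the
# paths that connect multiple sources within a cluster.
g <- connect_satalite_vertices(red.sites, g, gap = 2L, bias = "upstream")
red.sites$clusID <- clusters(g)$membership
break_connecting_source_paths(red.sites, g, "upstream")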

cnobles/gintools documentation built on Aug. 22, 2019, 10:36 a.m.