is_sharing: Sharing of integration sites between given groups.
In calabrialab/ISAnalytics: Analyze gene therapy vector insertion sites data identified from genomics next generation sequencing reads for clonal tracking studies

Description

Computes the number of integration sites shared between the groups identified in the input data.

Usage

is_sharing(
  ...,
  group_key = c("SubjectID", "CellMarker", "Tissue", "TimePoint"),
  group_keys = NULL,
  n_comp = 2,
  is_count = TRUE,
  relative_is_sharing = TRUE,
  minimal = TRUE,
  include_self_comp = FALSE,
  keep_genomic_coord = FALSE,
  table_for_venn = FALSE
)

Arguments

`...`	One or more integration matrices
`group_key`	Character vector of column names which identify a single group. An associated group id will be derived by concatenating the values of these fields, separated by "_"
`group_keys`	A list of keys for asymmetric grouping. If not NULL the argument `group_key` is ignored
`n_comp`	Number of comparisons to compute. This argument is relevant only if a single data frame and a single key are provided.
`is_count`	Logical, if `TRUE` also returns the count of IS for each group and the count for the union set
`relative_is_sharing`	Logical, if `TRUE` also returns the relative sharing.
`minimal`	Compute only combinations instead of all possible permutations? If `TRUE`, saves time and excludes redundant comparisons.
`include_self_comp`	Include comparisons with the same group?
`keep_genomic_coord`	If `TRUE` keeps the genomic coordinates of the shared integration sites in a dedicated column (as a nested table)
`table_for_venn`	Add a column with truth tables for Venn plots?
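
As an illustration of how `group_key` and `minimal` interact, the following base R sketch derives group ids by concatenating key values with "_" and then enumerates pairwise comparisons as combinations versus ordered pairs. The toy metadata values are made up, and the enumeration only mimics the documented behaviour; it is not the package's internal implementation.

```r
# Toy metadata: column names follow the default `group_key`,
# the values are invented for illustration only
meta <- data.frame(
  SubjectID  = c("PT01", "PT01", "PT02"),
  CellMarker = c("CD34", "CD14", "CD34"),
  Tissue     = c("BM", "PB", "BM"),
  TimePoint  = c("0030", "0060", "0030"),
  stringsAsFactors = FALSE
)
group_key <- c("SubjectID", "CellMarker", "Tissue", "TimePoint")

# Group id: values of the key columns concatenated with "_"
group_id <- do.call(paste, c(meta[group_key], sep = "_"))

# Unordered pairs, analogous to minimal = TRUE with n_comp = 2
pairs_minimal <- t(combn(group_id, 2))

# Ordered pairs without self comparisons, analogous to
# minimal = FALSE with include_self_comp = FALSE
all_pairs <- expand.grid(
  g1 = group_id, g2 = group_id,
  stringsAsFactors = FALSE
)
pairs_ordered <- all_pairs[all_pairs$g1 != all_pairs$g2, ]

print(group_id)
print(nrow(pairs_minimal))
print(nrow(pairs_ordered))
```

With three groups, the combination-based enumeration yields half as many comparisons as the ordered one, which is the saving that `minimal = TRUE` refers to.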

Details

An integration site is always identified by the combination of fields in mandatory_IS_vars(), thus these columns must be present in the input(s).
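
A quick pre-flight check for these columns can be sketched in base R. The vector `is_vars` below is an assumption about what mandatory_IS_vars() returns under default settings, and `missing_is_vars()` is a hypothetical helper, not an ISAnalytics function; in a real session, prefer the value returned by mandatory_IS_vars() itself.

```r
# Assumed default content of mandatory_IS_vars(); verify against
# the installed package version before relying on it
is_vars <- c("chr", "integration_locus", "strand")

# Hypothetical helper (not part of ISAnalytics): returns the names
# of required columns that are absent from `df`
missing_is_vars <- function(df, required = is_vars) {
  setdiff(required, colnames(df))
}

ok_df <- data.frame(
  chr = "1", integration_locus = 1000L, strand = "+", seqCount = 10
)
bad_df <- data.frame(chr = "1", integration_locus = 1000L)

print(missing_is_vars(ok_df))
print(missing_is_vars(bad_df))
```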

The function accepts multiple inputs for different scenarios; please refer to the vignette vignette("workflow_start", package = "ISAnalytics") for a more in-depth explanation.

Output

The function outputs a single data frame containing all requested comparisons and, optionally, individual group counts, genomic coordinates of the shared integration sites, and truth tables for plotting Venn diagrams.
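
To make the counts concrete, here is a minimal base R sketch of what one comparison row could contain for two toy groups. The column names (`shared`, `count_g1`, `count_g2`, `count_union`, `on_g1`, `on_union`), the percentage scale used for relative sharing, and the identifying columns in `is_vars` are all assumptions made for illustration; check them against the actual output of is_sharing().

```r
# Assumed identifying columns (see mandatory_IS_vars())
is_vars <- c("chr", "integration_locus", "strand")

# Two toy groups of integration sites, values are made up
g1 <- data.frame(
  chr = c("1", "2", "X"),
  integration_locus = c(100L, 200L, 300L),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)
g2 <- data.frame(
  chr = c("1", "X", "5"),
  integration_locus = c(100L, 300L, 500L),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# A site is shared when all identifying columns match
u1 <- unique(g1[is_vars])
u2 <- unique(g2[is_vars])
shared_sites <- merge(u1, u2, by = is_vars)
union_sites <- unique(rbind(u1, u2))

# One comparison row: absolute counts plus relative sharing (in %)
sharing_row <- data.frame(
  shared = nrow(shared_sites),
  count_g1 = nrow(u1),
  count_g2 = nrow(u2),
  count_union = nrow(union_sites)
)
sharing_row$on_g1 <- 100 * sharing_row$shared / sharing_row$count_g1
sharing_row$on_union <- 100 * sharing_row$shared / sharing_row$count_union

print(sharing_row)
```

Note that the site on chromosome X at locus 300 is not shared here, because the strand differs between the two groups.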

Plotting sharing

The resulting sharing data can be plotted as a heatmap via the function sharing_heatmap or as Venn diagrams via the function sharing_venn.

Value

A data frame

Required tags

The function will explicitly check for the presence of these tags:

All columns declared in mandatory_IS_vars()

Examples

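# Load the example data sets shipped with the package
data("integration_matrices", package = "ISAnalytics")
data("association_file", package = "ISAnalytics")
# Aggregate the matrices using the default aggregation key
aggreg <- aggregate_values_by_key(
    x = integration_matrices,
    association_file = association_file,
    value_cols = c("seqCount", "fragmentEstimate")
)
# Compute sharing between groups with all default arguments
sharing <- is_sharing(aggreg)
sharing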

calabrialab/ISAnalytics documentation built on Dec. 10, 2024, 10:50 p.m.