compute_near_integrations: Scans input matrix to find and merge near integration sites.


View source: R/recalibration-functions.R

Description

Lifecycle: experimental.

This function scans the input integration matrix to detect integration sites that are too "near" to each other and merges them into single integration sites, adjusting their values if needed.

Usage

compute_near_integrations(
  x,
  threshold = 4,
  keep_criteria = "max_value",
  strand_specific = TRUE,
  max_value_column = "seqCount",
  map_as_widget = TRUE,
  map_as_file = TRUE,
  file_path = ".",
  export_widget_path = NULL
)

Arguments

x

A single integration matrix, either with a single "Value" column or multiple value columns corresponding to different quantification types (obtained via comparison_matrix)

threshold

A single integer that represents an absolute number of bases for which two integrations are considered distinct

keep_criteria

While scanning, which integration should be kept? The 2 possible choices for this parameter are:

  • "max_value": keep the integration site which has the highest value (and collapse other values on that integration).

  • "keep_first": keep the first integration encountered.

strand_specific

Should strand be considered? If TRUE, two integration sites such as c(chr = "1", strand = "+", integration_locus = 14568) and c(chr = "1", strand = "-", integration_locus = 14568) are considered different and are not grouped together.
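As an illustration of the grouping described above, the following base R sketch builds a comparison key with and without the strand. The helper `grouping_key` and the toy data are hypothetical and are not part of ISAnalytics; the package may implement grouping differently.

```r
# Toy integration sites (hypothetical values, not package data)
sites <- data.frame(
    chr = c("1", "1"),
    strand = c("+", "-"),
    integration_locus = c(14568, 14568),
    stringsAsFactors = FALSE
)

# Hypothetical helper: build the key that decides which rows may be compared
grouping_key <- function(df, strand_specific = TRUE) {
    if (strand_specific) {
        paste(df$chr, df$strand, sep = "_")
    } else {
        df$chr
    }
}

key_specific <- grouping_key(sites, strand_specific = TRUE)
key_any <- grouping_key(sites, strand_specific = FALSE)

# With the strand considered, the two rows fall into different groups
length(unique(key_specific))
# Without the strand, they share a group and become candidates for merging
length(unique(key_any))
```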

max_value_column

The column that has to be considered for searching the maximum value

map_as_widget

Produce recalibration map as an HTML widget?

map_as_file

Produce recalibration map as a .tsv file?

file_path

String representing the path where the file will be saved. By default the function produces a folder in the current working directory and generates file names with time stamps.

export_widget_path

A path on disk where the produced widgets are saved, or NULL if the user doesn't wish to save the HTML file

Details

The whole matrix is scanned with a sliding window mechanism: for each row in the integration matrix an interval is calculated based on the threshold value, then a "look ahead" operation is performed to detect subsequent rows whose integration loci fall in the interval. If the CompleteAmplificationIDs of the near integrations are different, only the locus value (and optionally GeneName and GeneStrand, if the matrix is annotated) is modified; otherwise, rows with the same ID are aggregated and their values are summed.

If either map parameter (map_as_widget or map_as_file) is set to TRUE, the function also produces a recalibration map: a data frame that records the pre-recalibration values of chr, strand and integration locus together with the values each integration was changed to.
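The mechanism above can be sketched in base R on a toy matrix. This is a simplified illustration, not the package's implementation: the function `merge_near`, the toy data, and the exact window bounds (loci in [locus, locus + threshold)) are assumptions, and only the strand-specific "max_value" case is covered.

```r
# Hypothetical toy matrix; column names mirror those mentioned in the text
toy <- data.frame(
    chr = c("1", "1", "1", "2"),
    strand = c("+", "+", "+", "-"),
    integration_locus = c(100, 102, 110, 50),
    CompleteAmplificationID = c("s1", "s1", "s2", "s1"),
    seqCount = c(5, 20, 7, 3),
    stringsAsFactors = FALSE
)

# Simplified sketch: strand-specific, "max_value" criterion only
merge_near <- function(df, threshold = 4, value_col = "seqCount") {
    df <- df[order(df$chr, df$strand, df$integration_locus), ]
    new_locus <- df$integration_locus
    done <- rep(FALSE, nrow(df))
    for (i in seq_len(nrow(df))) {
        if (done[i]) next
        # Look ahead; assumed window is [locus, locus + threshold)
        in_window <- !done &
            df$chr == df$chr[i] &
            df$strand == df$strand[i] &
            df$integration_locus >= df$integration_locus[i] &
            df$integration_locus < df$integration_locus[i] + threshold
        idx <- which(in_window)
        # Keep the locus of the row with the highest value in the window
        keep <- idx[which.max(df[[value_col]][idx])]
        new_locus[idx] <- df$integration_locus[keep]
        done[idx] <- TRUE
    }
    # Recalibration map: locus before and after the scan
    map <- data.frame(
        chr = df$chr, strand = df$strand,
        locus_before = df$integration_locus,
        locus_after = new_locus
    )
    df$integration_locus <- new_locus
    # Rows sharing site and sample ID are summed; others only change locus
    merged <- aggregate(
        df[value_col],
        by = df[c("chr", "strand", "integration_locus",
                  "CompleteAmplificationID")],
        FUN = sum
    )
    list(matrix = merged, map = map)
}

res <- merge_near(toy)
res$matrix
res$map
```

In this toy input the sites at loci 100 and 102 fall in the same window and share a sample ID, so they collapse onto locus 102 (the higher seqCount) and their values are summed, while the sites at 110 and on chromosome 2 are left unchanged.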

Value

An integration matrix with the same number of rows as the input, or fewer

Note

We recommend using this function in combination with comparison_matrix to automatically perform recalibration on all quantification matrices.

Examples

path <- system.file("extdata", "ex_annotated_ISMatrix.tsv.xz",
    package = "ISAnalytics"
)
matrix <- import_single_Vispa2Matrix(path)
near <- compute_near_integrations(matrix,
    map_as_widget = FALSE,
    map_as_file = FALSE
)

ISAnalytics documentation built on April 9, 2021, 6:01 p.m.