bindingContextDistance: bindingContextDistance


View source: R/bindingContextDistance.R

Description

Calculate the Wasserstein distance between two replicates' or two proteins' binding contexts for CapR-generated RNA contexts.

Usage

bindingContextDistance(
  dir_stereogene_output = ".",
  RNA_context,
  protein_file,
  protein_file_input = NULL,
  dir_stereogene_output_2 = NULL,
  RNA_context_2 = NULL,
  protein_file_2 = NULL,
  protein_file_input_2 = NULL,
  range = c(-200, 200)
)

Arguments

dir_stereogene_output

Directory of Stereogene output for first protein. Default current directory.

RNA_context

Name of the RNA context file input to Stereogene. File names must exclude extensions such as ".bedGraph". Required.

protein_file

A vector of at least one protein file name to be averaged for calculation of distance. File names must exclude extensions such as ".bedGraph". All files in the list should be experimental/biological replicates. Required.

protein_file_input

A protein file name of background input to be subtracted from protein_file signal. File name must exclude extension. Only one input file is permitted. Optional.

dir_stereogene_output_2

Directory of Stereogene output for second protein. Default dir_stereogene_output.

RNA_context_2

Name of the RNA context file input to Stereogene. File names must exclude extensions such as ".bedGraph". Default same as RNA_context.

protein_file_2

Similar to protein_file. A second vector of at least one protein file name to be averaged for calculation of distance. File names must exclude extensions such as ".bedGraph". All files in the list should be experimental/biological replicates. Default same as protein_file.

protein_file_input_2

Similar to protein_file_input. A second protein file name of background input to be subtracted from protein_file_2 signal. File name must exclude extension. Only one input file is permitted. Optional.

range

A vector of two integers denoting the range upstream and downstream of the center of protein binding to consider in the comparison. Ranges that are too small miss the broader binding context, while ranges that are too large amplify distal noise in the binding data. Cannot exceed wSize/2 from write_config. Default c(-200, 200).
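The way protein_file, protein_file_input, and range interact can be illustrated with a minimal base-R sketch. It is not the package's implementation: the vectors positions, replicate_1, replicate_2, and input_signal are made-up stand-ins for signal profiles, and the order of operations (average the replicates, subtract the input, then restrict to the window) is an assumption based only on the argument descriptions above.

```r
# Hypothetical profiles sampled at positions -500..500 around the binding center.
positions    <- seq(-500, 500)
replicate_1  <- dnorm(positions, mean = 0,  sd = 50)
replicate_2  <- dnorm(positions, mean = 10, sd = 60)
input_signal <- rep(1e-4, length(positions))  # made-up background input

# Assumed step 1: average the replicate profiles position by position.
mean_signal <- rowMeans(cbind(replicate_1, replicate_2))

# Assumed step 2: subtract the single background input profile.
corrected <- mean_signal - input_signal

# Assumed step 3: keep only positions inside the requested window.
window_range <- c(-200, 200)
in_window <- positions >= window_range[1] & positions <= window_range[2]
windowed  <- corrected[in_window]
```

With the default window, only the 401 positions from -200 to 200 contribute to the comparison; everything farther from the binding center is ignored.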

Value

Wasserstein distance between the two protein file sets provided for the RNA structure context specified, with the input binding signal subtracted if applicable.

Note

At least one of RNA_context_2 or protein_file_2 must be provided. Otherwise, the distance would be calculated between a file and itself and would equal 0.

The Wasserstein distance is symmetric, so it does not matter which protein is first or second so long as replicates and input files correspond to one another.
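Both points in this note can be checked on toy data. The sketch below uses a hypothetical helper, wasserstein_1d, that computes the one-dimensional Wasserstein distance between two non-negative profiles on a shared, evenly spaced grid as the summed absolute difference of their cumulative distributions. It is an illustration under those assumptions, not the routine nearBynding uses internally.

```r
# Hypothetical helper: 1-D Wasserstein distance between two non-negative
# profiles sampled at the same evenly spaced positions.
wasserstein_1d <- function(p, q, spacing = 1) {
  stopifnot(length(p) == length(q), all(p >= 0), all(q >= 0))
  p <- p / sum(p)  # normalize each profile to total mass 1
  q <- q / sum(q)
  # For distributions on a common 1-D grid, the distance is the area
  # between the two cumulative distribution functions.
  sum(abs(cumsum(p) - cumsum(q))) * spacing
}

# Made-up profiles over the default window.
positions <- seq(-200, 200)
profile_a <- dnorm(positions, mean = -20, sd = 30)
profile_b <- dnorm(positions, mean = 40,  sd = 30)

d_ab <- wasserstein_1d(profile_a, profile_b)  # a versus b
d_ba <- wasserstein_1d(profile_b, profile_a)  # b versus a, same value
d_aa <- wasserstein_1d(profile_a, profile_a)  # a profile versus itself is 0
```

Swapping the arguments leaves the result unchanged, and comparing a profile with itself gives exactly 0, which is why a second context or a second protein file set is needed for a meaningful result.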

Examples

## pull example files
get_outfiles()
## distance between stem and hairpin contexts
bindingContextDistance(RNA_context = "chr4and5_3UTR_stem_liftOver",
                       protein_file = "chr4and5_liftOver",
                       RNA_context_2 = "chr4and5_3UTR_hairpin_liftOver")

## distance between internal and hairpin contexts
bindingContextDistance(RNA_context = "chr4and5_3UTR_internal_liftOver",
                       protein_file = "chr4and5_liftOver",
                       RNA_context_2 = "chr4and5_3UTR_hairpin_liftOver")

vbusa1/nearBynding documentation built on Aug. 4, 2021, 4:08 p.m.