avoid_gaps_update: Is is a function designed to remove values <= to...

Description Usage Arguments Value Examples

View source: R/Utilities.R

Description

Is is a function designed to remove values <= to 'gaps_threshold'. Nucleotides local and global positions, bins, size of regions/genes and exons will be recalculated. To use on metagene's table during RNA-seq analysis. Not made for ChIP-Seq analysis or to apply on matagene's data_frame. A similar function is implemented in produce_data_frame() with same arguments. The unique goal of this function is to allow permutation_test which match the plot created using avoid_gaps, bam_name and gaps_threshold arguments in the produce_data_frame function.

Usage

1
avoid_gaps_update(table, bam_name, gaps_threshold = 0)

Arguments

table

A data.table from produce_table(...) function of metagene.

bam_name

A reference bam_name to allow the same removal (position in bam) of values for other bam file.

gaps_threshold

A threshold under which values will be removed.

Value

A data.table with values <= to 'gaps_threshold' removed

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
## Not run: 
bam_files <- c(
system.file("extdata/c_al4_945MLM_demo_sorted.bam", package="metagene"),
system.file("extdata/c_al3_362PYX_demo_sorted.bam", package="metagene"),
system.file("extdata/n_al4_310HII_demo_sorted.bam", package="metagene"),
system.file("extdata/n_al3_588WMR_demo_sorted.bam", package="metagene"))
region <- c(
system.file("extdata/ENCFF355RXX_DPM1less.bed", package="metagene"),
system.file("extdata/ENCFF355RXX_NDUFAB1less.bed", package="metagene"),
system.file("extdata/ENCFF355RXX_SLC25A5less.bed", package="metagene"))
mydesign <- matrix(c(1,1,0,0,0,0,1,1),ncol=2, byrow=FALSE)
mydesign <- cbind(c("c_al4_945MLM_demo_sorted.bam",
                   "c_al3_362PYX_demo_sorted.bam",
                   "n_al4_310HII_demo_sorted.bam",
                   "n_al3_588WMR_demo_sorted.bam"), mydesign)
colnames(mydesign) <- c('Samples', 'cyto', 'nucleo')
mydesign <- data.frame(mydesign)
mydesign[,2] <- as.numeric(mydesign[,2])-1
mydesign[,3] <- as.numeric(mydesign[,3])-1

mg <- metagene$new(regions = region, bam_files = bam_files, 
                                                           assay = 'rnaseq')
mg$produce_table(flip_regions = FALSE, bin_count = 100, 
                               design = mydesign, normalization = 'RPM')
mg$produce_data_frame(avoid_gaps = TRUE, 
                       bam_name = "c_al4_945MLM_demo_sorted", 
                       gaps_threshold = 10)
mg$plot()
tab <- mg$get_table()
tab <- avoid_gaps_update(tab, 
       bam_name = 'c_al4_945MLM_demo_sorted', gaps_threshold = 10)
tab1 <- tab[which(tab0$design == "cyto"),]
tab2 <- tab[which(tab0$design == "nucleo"),]

library(similaRpeak)
perm_fun <- function(profile1, profile2) {
   sim <- similarity(profile1, profile2)
   sim[["metrics"]][["RATIO_NORMALIZED_INTERSECT"]]
}

ratio_normalized_intersect <- 
   perm_fun(tab1[, .(moy=mean(value)), by=bin]$moy, 
           tab2[, .(moy=mean(value)), by=bin]$moy)
ratio_normalized_intersect

permutation_results <- permutation_test(tab1, tab2, sample_size = 2,
                               sample_count = 1000, FUN = perm_fun)
hist(permutation_results, 
       main="ratio_normalized_intersect (1=total overlapping area)")
abline(v=ratio_normalized_intersect, col = 'red')
sum(ratio_normalized_intersect >= permutation_results) / 
       length(permutation_results)

## End(Not run)

metagene documentation built on Nov. 8, 2020, 5:37 p.m.