totals: Returns the total number of reads that map to an ORF and the...

Usage Arguments Value Author(s) Examples

Usage

1
totals(data, genedf, plot = F, cores = 1)

Arguments

data

object returned by read_profiling_data

genedf

data frame with the start and stop coordinated of the ORFs in the transcriptome.

plot

logical, whether to plot a scatter plot of results, defaults to F

cores

integer, number of threads to use. See details.

Value

returns a data frame where each row is a gene and sums of RNA-Seq coverage and total number of ribosome profiling reads per gene. If there is no RNA-Seq data coverages are set to NA

Currently PSOCK multithreadding is not supported. For Windows set cores to 1. For operating systems that support forking the number of threads that are used can be set by the cores argument. If the cores>1 then the data is processed by mclapply instead of lapply.

Author(s)

Alper Celik

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (data, genedf, plot = F, cores = 1)
{
    summer <- function(nuc, df) {
        a <- df[df$nucleotide == nuc, ]
        a <- sum(a[["freq"]], na.rm = T)
        a
    }
    genes <- genedf$gene
    get_data <- function(gene_name) {
        gene <- get_gene(gene = gene_name, genedf = genedf, data = data,
            seq = NA)
        if (is.null(gene)) {
            NULL
        }
        else {
            coord <- genedf[genedf$gene == gene_name, ]
            gene <- gene[gene$nucleotide >= 0 & gene$nucleotide <=
                (coord$end - coord$start + 1), ]
            counts <- sapply(unique(gene$nucleotide), summer,
                gene)
            gene <- as.data.frame(cbind(unique(gene[, 1:2]),
                counts))
            if (dim(gene)[1] == 0) {
                NULL
            }
            else {
                dats <- colSums(gene[, 2:3])
                gene <- data.frame(sum_coverage = dats[1], sum_ribo = dats[2])
                rownames(gene) <- gene_name
                gene
            }
        }
    }
    if (cores > 1) {
        totals <- do.call("rbind", mclapply(genes, get_data,
            mc.cores = cores))
        if (is.null(data$rna)) {
            totals$sum_coverage <- NA
        }
    }
    else {
        totals <- do.call("rbind", lapply(genes, get_data))
        if (is.null(data$rna)) {
            totals$sum_coverage <- NA
        }
    }
    totals <- as.data.frame(totals)
    if (plot) {
        print(ggplot(totals, aes(x = sum_coverage, y = sum_ribo)) +
            geom_point())
    }
    invisible(totals)
  }

celalp/ribofootprintR documentation built on May 12, 2019, 12:04 p.m.