get_gene: Get coverage and ribosome occupancy of a gene by nucleotide

Description Usage Arguments Value Author(s) See Also Examples

Description

Returns a data.frame with RNA-Seq coverage per nucleotide and This is workhorse function in the package, it is called by most other functions.

Usage

1
get_gene(gene_name, data, plot = F, seq = NA, genedf)

Arguments

gene_name

character name of the gene, must match genedf$gene entry

data

list returned by read_profiling_data() function

plot

whether to plot the data frame, default=F

seq

DNAStringset object of the FASTA file used for alignment. This is only needed for the occupancy function otherwise can be set to NA. Defaults to NA

genedf

a data frame that contains the gene name, and the nucleotide coordinates of start and stop locations as well as number of nucleotides and codon of the ORF.

Value

A data.frame with the nucleotide number (negative if within the 5' UTR), RNA-Seq coverage (NA is no RNA-Seq library is present) and the number of reads that map to each codon (and their frame). See read_profiling_data for the frame calculations

Author(s)

Alper Celik

See Also

read_profiling_data

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (gene_name, data, plot = F, seq = NA, genedf)
{
    ribo <- data[["ribo"]][[gene_name]]
    rna <- data[["rna"]][[gene_name]]
    if (is.null(ribo) == T) {
        gene <- NULL
    }
    else {
        coord <- genedf[genedf$gene == gene_name, ]
        coord <- coord[coord[, 1] == gene_name, ]
        if (is.null(rna) == T) {
            gene <- ribo
            gene$frame <- as.factor(gene$frame)
            if (plot) {
                plot <- ggplot() + geom_bar(data = gene, aes(x = nucleotide,
                  y = freq, fill = frame), stat = "identity",
                  position = "dodge")
                plot <- plot + geom_segment(aes(x = 0, y = 0,
                  xend = (coord$end - coord$start) + 1, yend = 0),
                  color = "blueviolet", size = 2)
                print(plot)
            }
        }
        else {
            gene <- dplyr::full_join(rna, ribo, by = "nucleotide")
            gene$frame <- as.factor(gene$frame)
            if (plot) {
                plot <- ggplot() + geom_line(data = gene, aes(x = nucleotide,
                  y = -coverage)) + geom_bar(data = gene[!is.na(gene$frame),
                  ], aes(x = nucleotide, y = freq, fill = frame),
                  stat = "identity", position = "dodge")
                plot <- plot + geom_segment(aes(x = 0, y = 0,
                  xend = coord$end - coord$start, yend = 0),
                  color = "blueviolet", size = 2)
                print(plot)
            }
        }
    }
    invisible(gene)
  }

celalp/ribofootprintR documentation built on May 12, 2019, 12:04 p.m.