signal_at_orf: Signal at all ORFs genome-wide (meta ORF)


Description

This function extracts the ChIP signal over all ORFs in the genome. For each ORF it collects the signal over the ORF plus both flanking regions (each half the length of the ORF) and scales the whole region to the same length (1000 positions). For example, for two genes with lengths of 500 bp and 2 kb, flanking regions of 250 bp and 1 kb, respectively, are collected upstream and downstream. Each region is then rescaled to a length of 1000, corresponding to a gene length of 500 plus 250 for each flanking region. After scaling, a loess model of the signal is built and used to output predictions of the signal at each position between 1 and 1000.
The function takes as input the wiggle data as a list of 16 chromosomes (output of readall_tab).
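
The scaling and smoothing steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by signal_at_orf: the helper scale_orf_region and its arguments are hypothetical, the wiggle data for one chromosome is assumed to be a pair of position and signal vectors, and strand handling is left out.

```r
# Hypothetical helper (not part of hwglabr): rescale one ORF plus its
# flanking regions to positions 1..1000 and smooth the signal with loess.
scale_orf_region <- function(position, signal, start, end, loessSpan = 0.05) {
  gene_len <- end - start + 1
  flank <- round(gene_len / 2)          # half the ORF length on each side
  region_start <- start - flank
  region_end <- end + flank

  # Keep only the wiggle rows that fall inside the region
  keep <- position >= region_start & position <= region_end
  d <- data.frame(
    # Map genomic coordinates linearly onto the interval [1, 1000]
    scaled = 1 + (position[keep] - region_start) /
      (region_end - region_start) * 999,
    signal = signal[keep]
  )

  # Fit the loess model and predict at every integer position 1..1000
  fit <- loess(signal ~ scaled, data = d, span = loessSpan)
  data.frame(position = 1:1000,
             signal = predict(fit, newdata = data.frame(scaled = 1:1000)))
}

# Made-up signal on a 4 kb stretch, with a 500 bp and a 2 kb "gene"
set.seed(42)
position <- 1:4000
signal <- sin(position / 200) + rnorm(4000, sd = 0.2)
short_gene <- scale_orf_region(position, signal, start = 1001, end = 1500)
long_gene <- scale_orf_region(position, signal, start = 1001, end = 3000)
```

Both results have 1000 rows, so genes of different lengths can be compared position by position.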

Note: Our wiggle data always contains gaps with missing chromosome coordinates and ChIP-seq signal. The function handles these gaps by skipping the affected genes. The number of skipped genes in each chromosome is printed to the console, as well as the final count (and percentage) of skipped genes.
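
One way such a check could work is sketched below. How signal_at_orf actually detects gaps is not stated here, so the helper region_is_complete, the toy data, and the assumption that a gene is skipped whenever any coordinate in its region is missing are all hypothetical.

```r
# Toy wiggle data for one chromosome with a gap at positions 101-150
wiggle_chr <- data.frame(position = c(1:100, 151:300), signal = 1)

# Toy ORF coordinates (hypothetical genes)
genes <- data.frame(gene = c("geneA", "geneB"),
                    start = c(30, 160),
                    end = c(69, 219))

# Hypothetical helper: TRUE if every coordinate of the ORF plus its
# flanking regions (half the ORF length each) is present in the data
region_is_complete <- function(positions, start, end) {
  flank <- round((end - start + 1) / 2)
  expected <- seq(start - flank, end + flank)
  all(expected %in% positions)
}

complete <- mapply(region_is_complete,
                   start = genes$start, end = genes$end,
                   MoreArgs = list(positions = wiggle_chr$position))

# Report the count and percentage of genes that would be skipped
skipped <- sum(!complete)
message(sprintf("Skipped %d of %d genes (%.1f%%)",
                skipped, nrow(genes), 100 * skipped / nrow(genes)))
```

Here the upstream flank of geneB overlaps the gap, so only geneB would be skipped.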

Usage

signal_at_orf(inputData, gff, gffFile, loessSpan = 0.05, saveFile = FALSE)

Arguments

inputData

List of wiggle data for the 16 chromosomes (output of readall_tab). No default.

gff

Optional data frame of the gff providing the ORF coordinates. Must be provided if gffFile is not. No default. Note: You can use the function gff_read in hwglabr to load your selected gff file.

gffFile

Optional string indicating the path to the gff file providing the ORF coordinates. Must be provided if gff is not. No default.

loessSpan

Number specifying span argument for loess function (the smoothing parameter alpha). This controls the degree of smoothing of the signal. Defaults to 0.05.
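
To get a feel for the span argument, the sketch below fits the base R loess function to made-up noisy data with a narrow and a wide span. The data and the roughness measure are illustrative assumptions only and do not come from hwglabr.

```r
# Made-up noisy signal over 1000 positions
set.seed(1)
d <- data.frame(x = 1:1000)
d$y <- sin(d$x / 100) + rnorm(1000, sd = 0.5)

# Narrow span (the default used here) versus a much wider span
fit_narrow <- predict(loess(y ~ x, data = d, span = 0.05))
fit_wide <- predict(loess(y ~ x, data = d, span = 0.5))

# Simple roughness measure: sum of squared second differences.
# A smoother curve gives a smaller value.
roughness <- function(v) sum(diff(v, differences = 2)^2)
roughness(fit_narrow) > roughness(fit_wide)
```

A small span follows local fluctuations closely, while a larger span averages over more neighboring positions and gives a smoother curve.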

saveFile

Boolean indicating whether output should be written to a .txt file (in current working directory). If saveFile = FALSE, the output is printed to the console or, if assigned, stored in an R object. Defaults to FALSE.

Value

A local data frame with four columns:

  1. chr Chromosome number

  2. position Nucleotide coordinate (in normalized total length of 1 kb)

  3. signal ChIP-seq signal at each position (1 to 1000)

  4. gene Systematic gene name
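
Because every gene shares the same 1 to 1000 position axis, a meta-ORF profile can be obtained by averaging the signal per position across genes. The following is a minimal sketch using base R on a made-up data frame with the four columns listed above; the gene names and signal values are placeholders, not real output.

```r
# Made-up stand-in for the output of signal_at_orf: three genes,
# each with a constant signal, on the common 1..1000 position axis
orf_signal <- data.frame(
  chr = rep(c("chr01", "chr01", "chr02"), each = 1000),
  position = rep(1:1000, times = 3),
  signal = rep(c(1, 2, 3), each = 1000),
  gene = rep(c("geneA", "geneB", "geneC"), each = 1000)
)

# Mean signal at each scaled position across all genes (meta ORF)
meta_orf <- aggregate(signal ~ position, data = orf_signal, FUN = mean)

# Positions 251 to 750 correspond to the ORF itself, the rest to the flanks
plot(meta_orf$position, meta_orf$signal, type = "l",
     xlab = "Scaled position", ylab = "Mean ChIP signal")
```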

Examples

## Not run: 
signal_at_orf(WT, gff = gff)

signal_at_orf(WT, gffFile = "S288C_annotation_modified.gff",
              loessSpan = 0.1, saveFile = TRUE)

## End(Not run)

luisvalesilva/hwglabr documentation built on May 21, 2019, 8:56 a.m.