qualityScores_EM: Wrapper function to calculate EM metrics

View source: R/qualityScores_EM.R

Description

Wrapper that reads the ChIP and Input bam files and returns the EM QC-metrics. These comprise metrics from the cross-correlation analysis, metrics from peak calling, and general metrics such as the read length or the NRF (non-redundant fraction). In total, 22 features are calculated.

Usage

qualityScores_EM(
  chipName,
  inputName,
  read_length,
  chip.data = NULL,
  input.data = NULL,
  readAlignerType = "bam",
  annotationID,
  mc = 1,
  crossCorrelation_Input = FALSE,
  downSamplingChIP = FALSE,
  writeWig = FALSE,
  savePlotPath = NULL,
  debug = FALSE
)

Arguments

chipName

Character, filename (and optional path) for the ChIP bam file (without the .bam extension)

inputName

Character, filename (and optional path) for the Input control bam file (without the .bam extension)

read_length

Integer, length of the reads

chip.data

Optional, taglist object for ChIP reads as returned by spp or readBamFile() function. If not set (NULL) the data will be read from the BAM file with name specified by "chipName"

input.data

Optional, taglist object for Input control reads as returned by spp or readBamFile() function. If not set (NULL) the data will be read from the BAM file with name specified by "inputName"

readAlignerType

Character, file format of the aligned reads. The bam (default) and tagAlign file formats are supported

annotationID

Character, indicating the genome assembly

mc

Integer, the number of CPUs for parallelization (default=1)

crossCorrelation_Input

Boolean, if TRUE the cross-correlation and EM metrics are also calculated for the Input. The default is FALSE because the calculation increases the running time and the resulting metrics are not used in quality prediction.

downSamplingChIP

Boolean, to be used to downsample reads within enrichment peaks. This option was used for generating simulated low quality (low enrichment) profiles for testing the prediction models. The default is FALSE and should generally not be used by end users.

writeWig

Boolean, saves smoothed tag density in wig format in working directory for Input and ChIP

savePlotPath

Character, optional path under which the cross-correlation plot should be saved. The default is NULL, in which case the plot is forwarded to stdout

debug

Boolean, to enter debugging mode. Intermediate files are saved in working directory

Value

returnList, a list containing the following elements:

QCscores_ChIP

List of QC-metrics with cross-correlation values for the ChIP

QCscores_Input

List of QC-metrics with cross-correlation values for the Input if the "crossCorrelation_Input" parameter was set to TRUE, NULL otherwise

QCscores_binding

List of QC scores from peak calls

TagDensityChip

Tag-density profile, smoothed by the Gaussian kernel (for further details see the "spp" package)

TagDensityInput

Tag-density profile, smoothed by the Gaussian kernel (for further details see the "spp" package)
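The returned object is an ordinary R list, so its elements can be inspected with standard list tools. The sketch below uses a mock object that mirrors only the top-level element names listed above. The inner values (for example "metric_a") are placeholders invented for illustration and are not real ChIC output.

```r
# Mock object with the same top-level names as the documented return value.
# All inner values are hypothetical placeholders, NOT real ChIC metrics.
CC_Result <- list(
  QCscores_ChIP    = list(metric_a = 0.1),
  QCscores_Input   = NULL,                 # NULL when crossCorrelation_Input = FALSE
  QCscores_binding = list(metric_b = 0.2),
  TagDensityChip   = list(),
  TagDensityInput  = list()
)

# Overview of the top-level structure
str(CC_Result, max.level = 1)

# Input metrics are only present if crossCorrelation_Input was set to TRUE
has_input_scores <- !is.null(CC_Result$QCscores_Input)

# Access the ChIP cross-correlation metrics and the peak-calling scores
chip_scores    <- CC_Result$QCscores_ChIP
binding_scores <- CC_Result$QCscores_binding
```

Checking QCscores_Input for NULL before use avoids errors in downstream code when the function was run with the default crossCorrelation_Input=FALSE.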

Examples

## This command is time intensive to run

## To run this example code the user MUST provide 2 bam files: one for the 
## ChIP and one for the input. Here we used ChIP-seq data from ENCODE. Two 
## example files can be downloaded using the following links:
## https://www.encodeproject.org/files/ENCFF000BFX/
## https://www.encodeproject.org/files/ENCFF000BDQ/
## Save them in the working directory (here given by the temporary 
## directory "filepath")

mc=4
## Not run: 

filepath=tempdir()
setwd(filepath)

## Each wget command must be a single string without an embedded line break
system(paste("wget",
"https://www.encodeproject.org/files/ENCFF000BFX/@download/ENCFF000BFX.bam"))
system(paste("wget",
"https://www.encodeproject.org/files/ENCFF000BDQ/@download/ENCFF000BDQ.bam"))

chipName=file.path(filepath,"ENCFF000BFX")
inputName=file.path(filepath,"ENCFF000BDQ")

CC_Result=qualityScores_EM(chipName=chipName, inputName=inputName, 
read_length=36, mc=mc, annotationID = "hg19")

## End(Not run)
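On systems without wget, the download step can be done with base R's download.file() instead. The following is a minimal sketch under stated assumptions: the helper names encode_bam_url() and fetch_bam() are hypothetical and not part of ChIC, and the URL pattern is copied unchanged from the example above.

```r
# Hypothetical helper: builds the download URL following the pattern used in
# the example (the accession appears in the path and in the file name).
encode_bam_url <- function(accession) {
  sprintf("https://www.encodeproject.org/files/%s/@download/%s.bam",
          accession, accession)
}

# Hypothetical helper: downloads one bam file with download.file() unless it
# already exists, and returns the path WITHOUT the .bam extension, which is
# the form expected by the chipName and inputName arguments.
fetch_bam <- function(accession, dir = tempdir()) {
  dest <- file.path(dir, paste0(accession, ".bam"))
  if (!file.exists(dest)) {
    # mode = "wb" keeps the binary file intact on Windows
    download.file(encode_bam_url(accession), destfile = dest, mode = "wb")
  }
  file.path(dir, accession)
}

## Not run (requires network access and downloads large files):
# chipName  <- fetch_bam("ENCFF000BFX")
# inputName <- fetch_bam("ENCFF000BDQ")
```

Returning the extension-free path keeps the result directly usable in the qualityScores_EM() call shown in the example.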

carmencita/ChIC documentation built on April 28, 2021, 7:20 p.m.