SCRATsummary: SCRATsummary

View source: R/SCRATsummary.R

Description

Compile SCRAT summary table

Usage

SCRATsummary(dir = "", genome, bamfile = NULL, singlepair = "automated",
  removeblacklist = T, log2transform = T, featurelist = c("GENE", "ENCL",
  "MOTIF_TRANSFAC", "MOTIF_JASPAR", "GSEA"), Genestarttype = "TSSup",
  Geneendtype = "TSSdown", Genestartbp = 3000, Geneendbp = 1000,
  ENCLclunum = 2000, Motifflank = 100, GSEAterm = "c5.bp",
  GSEAstarttype = "TSSup", GSEAendtype = "TSSdown", GSEAstartbp = 3000,
  GSEAendbp = 1000)

Arguments

dir

The folder where the bam files are stored. If bamfile is NULL, all bam files within the folder will be analyzed.

genome

The genome to which the bam files were mapped. Should be one of the following: "hg19", "hg38", "mm9", "mm10".

bamfile

A character vector of bam file names. If NULL, all bam files in dir will be included.
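
The interaction between dir and bamfile can be pictured with a small sketch. This is not SCRAT's own code: `resolve_bamfiles` is a hypothetical helper, and matching files by a ".bam" extension is an assumption made for illustration.

```r
# Hypothetical helper (not part of SCRAT): decide which bam files to analyze.
resolve_bamfiles <- function(dir, bamfile = NULL) {
  if (is.null(bamfile)) {
    # Assumption: bam files are recognized by their ".bam" extension.
    bamfile <- list.files(dir, pattern = "\\.bam$")
  }
  bamfile
}

# Demonstration in a temporary folder holding empty placeholder files.
demo_dir <- file.path(tempdir(), "scrat_bam_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("cell1.bam", "cell2.bam", "notes.txt")))

all_bams <- resolve_bamfiles(demo_dir)               # "notes.txt" is skipped
one_bam  <- resolve_bamfiles(demo_dir, "cell1.bam")  # explicit choice is kept
```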

singlepair

Whether the original sequencing files are single-end or paired-end. Should be one of the following: "automated", "single", "pair". Default is "automated" where SCRAT will automatically determine the type.

removeblacklist

Logical value indicating whether blacklist regions should be removed.

log2transform

Logical value indicating whether the read counts should be log2 transformed (after adding pseudo-count of 1).
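
As a worked illustration of the transform described above, the following sketch applies log2 with a pseudo-count of 1 to a toy count matrix. The matrix values and names are made up; only the formula comes from the description.

```r
# Toy read-count matrix: rows are features, columns are cells (made-up values).
counts <- matrix(c(0, 1, 3, 7, 15, 0), nrow = 3,
                 dimnames = list(c("feature1", "feature2", "feature3"),
                                 c("cellA", "cellB")))

# Add a pseudo-count of 1 so that zero counts map to 0 instead of -Inf.
log2counts <- log2(counts + 1)
log2counts
```

A count of 0 stays at 0 after the transform, while counts of 1, 3, 7 and 15 become 1, 2, 3 and 4.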

featurelist

A character vector specifying which kinds of features should be considered. Should be chosen from the following: "GENE", "ENCL", "MOTIF_TRANSFAC", "MOTIF_JASPAR", "GSEA". By default all features are included. Note that "GSEA" features can be slow to compute.

Genestarttype

For "GENE" features, type of starting site. Should be one of the following: "TSSup", "TSSdown", "TESup", "TESdown". The four options stand for upstream of the TSS (transcription start site), downstream of the TSS, upstream of the TES (transcription end site) and downstream of the TES, respectively.

Geneendtype

For "GENE" features, type of ending site. Options are the same as for Genestarttype.

Genestartbp

For "GENE" features, how many base pairs away from starting TSS/TES. For example, Genestarttype="TSSup" and Genestartbp=500 means 500 bp upstream of TSS.

Geneendbp

For "GENE" features, how many base pairs away from ending TSS/TES.
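
To make the four Gene* arguments concrete, here is a minimal sketch of how a start or end coordinate could be derived from them. `site_position` is a hypothetical helper, not a SCRAT function, and the strand handling (upstream means lower coordinates on the "+" strand and higher coordinates on the "-" strand) is an assumption rather than something this page states.

```r
# Hypothetical helper: coordinate of a site that lies `bp` base pairs
# upstream or downstream of a gene's TSS or TES.
site_position <- function(tss, tes, strand, type, bp) {
  type <- match.arg(type, c("TSSup", "TSSdown", "TESup", "TESdown"))
  anchor <- if (startsWith(type, "TSS")) tss else tes
  direction <- if (endsWith(type, "up")) -1 else 1
  # Assumption: on the minus strand, upstream points to higher coordinates.
  if (strand == "-") direction <- -direction
  anchor + direction * bp
}

# Default settings: from 3000 bp upstream of the TSS to 1000 bp downstream.
plus_start <- site_position(10000, 25000, "+", "TSSup", 3000)
plus_end   <- site_position(10000, 25000, "+", "TSSdown", 1000)

# Same settings for a made-up minus-strand gene (TSS at the higher coordinate).
minus_window <- range(site_position(25000, 10000, "-", "TSSup", 3000),
                      site_position(25000, 10000, "-", "TSSdown", 1000))
```

With these made-up coordinates the plus-strand window runs from 7000 to 11000 and the minus-strand window from 24000 to 28000.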

ENCLclunum

Number of clusters for ENCL features. Should be one of 1000, 2000 or 5000.

Motifflank

Defines the size of flanking region of motif sites in base pairs.

GSEAterm

The GSEA terms included in the analysis. Only useful when "GSEA" is included in featurelist. Should be one of the following: "h.all","c1.all","c2.cgp","c2.cp","c3.mir","c3.tft","c4.cgn","c4.cm","c5.bp","c5.cc","c5.mf","c6.all","c7.all".

GSEAstarttype

For "GSEA" features, type of starting site.

GSEAendtype

For "GSEA" features, type of ending site.

GSEAstartbp

For "GSEA" features, how many base pairs away from starting TSS/TES.

GSEAendbp

For "GSEA" features, how many base pairs away from ending TSS/TES.

Details

This function compiles a SCRAT summary table from bam files. The results should be the same as those obtained by running the analysis in the SCRAT GUI.

Author(s)

Zhicheng Ji, Weiqiang Zhou, Hongkai Ji <zji4@zji4.edu>

Examples

## Not run: 
   SCRATsummary(dir="bamfiledir",genome="hg19")

## End(Not run)
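
A call with non-default settings might look like the sketch below. Every argument name and value comes from the Arguments section; "bamfiledir" is the same placeholder folder as in the example above. The guard around the call is an addition for illustration, so that the snippet does nothing when SCRAT is not installed or the folder does not exist, and it assumes SCRATsummary is exported by the SCRAT package.

```r
# Restrict the analysis to GENE and ENCL features, use a symmetric
# 500 bp window around the TSS, and request 1000 ENCL clusters.
args <- list(dir = "bamfiledir", genome = "hg19",
             featurelist = c("GENE", "ENCL"),
             Genestarttype = "TSSup", Genestartbp = 500,
             Geneendtype = "TSSdown", Geneendbp = 500,
             ENCLclunum = 1000)

# Run only when SCRAT is installed and the placeholder folder exists.
if (requireNamespace("SCRAT", quietly = TRUE) && dir.exists(args$dir)) {
  res <- do.call(SCRAT::SCRATsummary, args)
}
```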

zji90/SCRAT documentation built on April 7, 2020, 2:08 p.m.