priorHistone_init: Process histone ChIP-seq dataset to build the prior without...

In yezhengSTAT/permseq_0.3.0: Mapping protein-DNA interactions in highly repetitive regions of the genomes with prior-enhanced read mapping


View source: R/priorHistone_init.R

If no DNase-seq dataset is available, process histone data and select the histone ChIP-seq dataset that shows the strongest relationship with the ChIP-seq data. The selected histone ChIP-seq data will be used in place of DNase-seq data in the model.

priorHistone_init(histoneFile = NULL, histoneName = NULL, chipFile = NULL, fragL = 200,
AllocThres = 900, chrList, capping = 0, outfileLoc = "./", 
bowtieDir, bowtieIndex, vBowtie = 2, mBowtie = 99,
pBowtie = 8, bwaDir, bwaIndex, nBWA = 2, oBWA = 1, tBWA = 8, mBWA = 99,
csemDir, picardDir, saveFiles = TRUE)

`histoneFile`	Input histone ChIP-seq file(s) in fastq format. To save time, a SAM file of aligned reads that includes multi-mapping reads, or BAM or BED files of reads already allocated by CSEM, can be supplied instead if available. Otherwise, it is better to start from the fastq file. Default value is NULL.
`histoneName`	Name of the histone ChIP-seq data. If multiple histone datasets are used, supply the names as a vector.
`chipFile`	Input ChIP-seq file(s) in fastq format, or in SAM format to save time if the reads are already aligned and multi-mapping reads are included. Default value is NULL.
`fragL`	Average fragment length. The default value is 200.
`AllocThres`	Allocation threshold. Only reads with scores higher than `AllocThres` are selected, where the score is the allocation probability multiplied by 1000. Default is 900.
`chrList`	A vector of chromosomes to include in the analysis. Default is NULL, in which case `priorHistone_init` gets the list from the processed files. If given by the user, it should be consistent with the chromosome name(s) in the corresponding fasta file(s). For more information, see Details.
`capping`	Maximum number of reads allowed to start at each nucleotide position. To avoid potential PCR amplification artifacts, the maximum number of reads that can start at a nucleotide position is capped. Default is 0 (no capping, i.e. no maximum restriction).
`outfileLoc`	Directory to store processed files. Default is set to "./".
`bowtieDir`	Directory where Bowtie is installed. Default is NULL.
`bowtieIndex`	Bowtie index, used in Bowtie alignment. Users select the aligner, Bowtie or BWA, by providing the corresponding index. Default is NULL.
`vBowtie`	Bowtie parameter. In -v mode, alignments may have no more than `vBowtie` mismatches, where `vBowtie` may be a number from 0 through 3, set using the -v option. Default value is 2.
`mBowtie`	Bowtie parameter. The -m option instructs Bowtie to refrain from reporting any alignments for reads having more than `mBowtie` reportable alignments. Default value is 99, which retains multi-reads with up to 99 reportable alignments.
`pBowtie`	Bowtie parameter. The -p option causes Bowtie to launch the specified number of parallel search threads. Each thread runs on a different processor/core and all threads find alignments in parallel. Default value is 8.
`bwaDir`	Directory where BWA is installed. Default is NULL.
`bwaIndex`	BWA index, used in BWA alignment. Users select the aligner, Bowtie or BWA, by providing the corresponding index. Default is NULL.
`nBWA`	BWA parameter. In "bwa aln -n", an integer value denotes the maximum edit distance, including mismatches and gap opens. Otherwise, the value is the fraction of missing alignments given a 2% uniform base error rate. Default value is 2.
`oBWA`	BWA parameter. In "bwa aln -o", it specifies the maximum number of gap opens. Default is 1.
`tBWA`	BWA parameter. In "bwa aln -t", it is the number of threads in multi-threading mode. Default is 8.
`mBWA`	BWA parameter. In "bwa samse -n", it restricts the maximum number of alignments to output for each read. If a read has more hits, the XA tag will not be written. Default is 99.
`csemDir`	Directory where CSEM was installed. The default value is NULL.
`picardDir`	Directory where the PICARD jar file is saved.
`saveFiles`	Option to save intermediate files created. Default set as TRUE.
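
As an illustration of how `AllocThres` and `capping` act on allocated reads, here is a minimal sketch in base R. The data frame, its column names, and the order of the two steps are assumptions made for this example; the package's internal implementation may differ.

```r
# Toy allocated reads, one row per read, with an allocation probability.
# The column names are hypothetical and chosen only for this illustration.
reads <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr1", "chr2"),
  start = c(100L, 100L, 100L, 250L, 40L),
  prob  = c(0.95, 0.99, 0.92, 0.50, 0.91),
  stringsAsFactors = FALSE
)

AllocThres <- 900
capping    <- 2   # 0 would mean no capping

# Score = allocation probability * 1000; keep reads scoring above the threshold.
reads$score <- reads$prob * 1000
kept <- reads[reads$score > AllocThres, ]

# Keep at most `capping` reads that start at the same position.
if (capping > 0) {
  rank_in_pos <- ave(seq_len(nrow(kept)), kept$chr, kept$start, FUN = seq_along)
  kept <- kept[rank_in_pos <= capping, ]
}

kept
```

Here the read with probability 0.50 fails the threshold, and the third read starting at chr1:100 is dropped by the cap.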

Processes histone ChIP-seq files and generates the module for further analysis in priorHistone_multi.

This function processes the histone ChIP-seq files and generates a marginal plot (marginal_plot.pdf), stored in outfileLoc, to help decide which histone dataset should be used in place of DNase-seq data. Choose the histone ChIP-seq dataset whose read counts show the most clearly increasing relationship with the ChIP data.
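
The choice is made by inspecting the plot, but the idea of an "increasing relationship" can be sketched numerically. The example below is not the package's method: it uses toy counts, and the helpers `marginal` and `trend` are hypothetical. It averages ChIP counts at each histone read-count level and uses a Spearman correlation as a rough monotonicity score.

```r
# Toy data: ChIP and histone read counts at the same genomic positions.
histoneA <- c(0, 0, 1, 1, 2, 2, 3, 3, 4, 4)
histoneB <- c(4, 0, 3, 1, 2, 2, 1, 3, 0, 4)
chip     <- c(0, 1, 1, 2, 3, 3, 5, 6, 8, 9)

# Average ChIP count at each histone read-count level (hypothetical helper).
marginal <- function(histone, chip) tapply(chip, histone, mean)

# Spearman correlation between the level and the mean ChIP count,
# used here as a simple proxy for how monotone the relationship is.
trend <- function(histone, chip) {
  m <- marginal(histone, chip)
  cor(as.numeric(names(m)), as.numeric(m), method = "spearman")
}

scores <- c(histoneA = trend(histoneA, chip), histoneB = trend(histoneB, chip))
best <- names(which.max(scores))
best
```

In this toy case `histoneA` would be preferred, since its mean ChIP count rises steadily with the histone read count.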

If no chrList is provided, priorHistone_init generates the list from the processed files (the .sam file if the histone input file is in fastq format, or the .bed file if the histone input file is in .bam or .bed format). Providing chrList speeds up the procedure, but it must be consistent with the chromosome name(s) in the corresponding .fa or .fasta file(s). For example, it should be the name that follows ">" on the first line of the .fa file.
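
One way to build a consistent chrList is to read the names directly from the FASTA headers. The sketch below uses base R only and a temporary toy file. Treating the name as the first whitespace-delimited token after ">" is an assumption; check it against how your aligner index names the sequences.

```r
# Write a tiny FASTA file so the example is self-contained.
fa <- tempfile(fileext = ".fa")
writeLines(c(">chr1 toy sequence", "ACGTACGT", ">chr2", "GGCCTTAA"), fa)

# Header lines start with ">"; take the first token after it as the name.
header  <- grep("^>", readLines(fa), value = TRUE)
chrList <- sub("^>[[:space:]]*([^[:space:]]+).*$", "\\1", header)
chrList
```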

Users can choose between Bowtie and BWA for the alignment by providing the corresponding index and leaving the other at its default value, NULL. If both indices are provided, the package automatically uses Bowtie for the multi-mapping read alignment.

The aligned SAM file goes through a filtering step to remove duplicates. By default, 'samtools rmdup -s' is used for this. If the user provides the PICARD jar path, PICARD is used instead.
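
The selection rules for the aligner and the duplicate-removal tool can be summarized in a small sketch. `choose_tools` is a hypothetical helper written for this illustration and is not part of permseq. Stopping when neither index is given is also an assumption.

```r
# Hypothetical helper mirroring the documented rules; not a permseq function.
choose_tools <- function(bowtieIndex = NULL, bwaIndex = NULL, picardDir = NULL) {
  if (is.null(bowtieIndex) && is.null(bwaIndex)) {
    stop("Provide either bowtieIndex or bwaIndex.")  # assumed behavior
  }
  # Bowtie takes precedence when both indices are supplied.
  aligner <- if (!is.null(bowtieIndex)) "bowtie" else "bwa"
  # PICARD is used only when its jar path is given; otherwise samtools rmdup -s.
  dedup <- if (!is.null(picardDir)) "picard" else "samtools rmdup -s"
  list(aligner = aligner, dedup = dedup)
}

# Placeholder paths, for illustration only.
both     <- choose_tools(bowtieIndex = "/path/to/bowtie_index",
                         bwaIndex = "/path/to/bwa_index")
bwa_only <- choose_tools(bwaIndex = "/path/to/bwa_index",
                         picardDir = "/path/to/picard")
```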

The plot(), summary(), names() and print() methods can be used to inspect the information contained in the "Prior" object. To obtain the ChIP-seq (or histone) alignment information from bowtie, use summary().

A new "Prior" object is created.

First, for each histone ChIP-seq dataset within dnaseHistone:

`dnaseKnots`	Knots for B-spline functions. They are the 90, 99 and 99.9th percentiles of read counts.
`dnaseThres`	A vector of DNase-seq groups used to generate aggregated ChIP data. After alignment, positions that have the same DNase-seq read count are clustered into one group, and `dnaseThres` holds the read count corresponding to each group. Each count corresponds to one group, and the grouping partitions the whole genome into multiple segments.
`posLoc_bychr`	Location of the files containing the group index of each segment of the genome.
`posLoc3_bychr`	Location of the files containing which segments of the genome are in which group based on the trinary Histone positions according to 90 and 99th percentiles of read counts.
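
The percentile knots, the count-based grouping, and the trinary coding described above can be sketched on toy data. This is a minimal illustration in base R. Which positions enter the percentile calculation and how ties at the cut points are handled are assumptions, and `group` and `trinary` are names introduced only for this example.

```r
# Toy per-position read counts (hypothetical data).
counts <- c(rep(0, 900), rep(1, 60), rep(2, 30), rep(5, 9), 40)

# Knots at the 90th, 99th and 99.9th percentiles of the read counts.
dnaseKnots <- quantile(counts, probs = c(0.90, 0.99, 0.999), names = FALSE)

# One group per distinct read count; `group` is each position's group index.
dnaseThres <- sort(unique(counts))
group <- match(counts, dnaseThres)

# Trinary coding with the 90th and 99th percentiles as cut points:
# 1 = at or below the 90th, 2 = between, 3 = above the 99th.
cuts <- quantile(counts, probs = c(0.90, 0.99), names = FALSE)
trinary <- findInterval(counts, cuts, left.open = TRUE) + 1L

table(trinary)
```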

Other elements of the "Prior" include:

`chipName`	Name of ChIP-seq dataset(s).
`chipNum`	Number of ChIP-seq dataset(s).
`chipAlign`	ChIP-seq alignment summary information from bowtie.
`chipSAM`	Location of aligned ChIP-seq in SAM format.
`chipUni`	Location of the aligned ChIP-seq uni-reads files in BED format.
`histoneName`	Name of histone ChIP-seq dataset(s). If no value is given, histoneName is set to a vector of index numbers (1:length(histoneFile)).
`histoneNum`	Number of Histone dataset(s).
`histoneAlign`	Histone alignment summary information from bowtie.
`chrList`	Chromosome list.
`fragL`	Fragment length, given as a parameter.
`bowtieInfo`	List of bowtie information used: bowtieIndex, bowtieDir, vBowtie, mBowtie and pBowtie.
`bwaInfo`	List of BWA related information: bwaDir, bwaIndex, nBWA, oBWA, tBWA, mBWA.
`csemDir`	Directory of CSEM given as a parameter.
`picardDir`	Directory where PICARD jar file is saved.
`chrom.ref`	Name of the file for chromosome info, given as a parameter.
`outfileLoc`	Directory where processed files are (given as an argument).

Xin Zeng, M. Constanza Rojo-Alfaro, Ye Zheng

## Not run: 
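# All arguments are passed by name so that csemDir and picardDir are not
# matched positionally to bwaDir and bwaIndex.
object = priorHistone_init(histoneFile = NULL, histoneName = NULL,
  chipFile = chipFile, fragL = fragL, AllocThres = 900, chrList = chrList,
  capping = 0, outfileLoc = "./", bowtieDir = bowtieDir,
  bowtieIndex = bowtieIndex, vBowtie = 2, mBowtie = 99, pBowtie = 8,
  csemDir = csemDir, picardDir = picardDir, saveFiles = TRUE)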

## End(Not run)

yezhengSTAT/permseq_0.3.0 documentation built on May 24, 2019, 2:07 a.m.
