runa_bsmap: 'Run bsmap on farm'

Description

BSMAP pipeline to count C and CT bases across the genome. The pipeline first aligns adapter-trimmed reads against the reference genome. It then sorts the BAM file (the BAM file is kept at the end) and removes PCR duplicates. After marking duplicates, it uses bamtools to keep properly paired reads and bamUtil to clip overlapping reads. Finally, it extracts the DNA methylation ratio at each cytosine site.

Usage

runa_bsmap(inputdf,
  ref.fa = "~/dbcenter/Ecoli/reference/Ecoli_k12_MG1655.fasta",
  picardpwd = "$HOME/bin/picard-tools-2.1.1/picard.jar", email = NULL,
  runinfo = c(FALSE, "bigmemh", 1))

Arguments

inputdf

An input data.frame of fastq files. Must contain the columns fq1, fq2 and out (and/or bam). If inputdf contains bam, the bwa alignment step is skipped. Optional columns: group (group id), sample (sample id), PL (platform, e.g. illumina), LB (library id), PU (unit, e.g. unit1). These strings are passed to BWA mem through -R.
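As an illustration of the optional columns, an inputdf that carries read-group information might be built as sketched below. The paths are the ones used in the Examples section; the values given for group, sample, PL, LB and PU are hypothetical placeholders, not values required by the function.

```r
# Sketch of an input table with the optional read-group columns.
# The group/sample/PL/LB/PU values below are made-up placeholders.
inputdf <- data.frame(
  fq1    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_1.fastq.gz",
  fq2    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_2.fastq.gz",
  out    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970",
  group  = "g1",
  sample = "SRR2921970",
  PL     = "illumina",
  LB     = "lib1",
  PU     = "unit1",
  stringsAsFactors = FALSE
)

# Check that the required columns are present before submitting.
required <- c("fq1", "fq2", "out")
stopifnot(all(required %in% names(inputdf)))
```

Checking the required columns up front is a cheap way to catch a misnamed column before any job scripts are written.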

ref.fa

Full path to the bwa-indexed reference genome FASTA file.

picardpwd

The absolute path of picard.jar.

email

Email address that farm notifies once the jobs are done or have failed.

runinfo

Parameters specifying the array job partition information. A vector c(FALSE, "bigmemh", "1"): 1) whether to run the job, default=FALSE; 2) -p partition name, default=bigmemh; and 3) --cpus, default=1. It is passed to set_array_job.
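Because runinfo mixes a logical, a string and a number, R stores all three elements as character strings. The sketch below shows that coercion and one way to recover typed values. How runa_bsmap or set_array_job actually parse the vector is not documented here, so the variable names run, partition and cpus are illustrative only.

```r
# c() coerces mixed types to a common type, here character.
runinfo <- c(FALSE, "bigmemh", 1)
print(runinfo)  # all three elements are character strings

# Recover typed values from the character vector (illustrative names).
run       <- as.logical(runinfo[1])
partition <- runinfo[2]
cpus      <- as.integer(runinfo[3])
```

This is why writing the third element as 1 or as "1" gives the same vector.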

Details

For more detail about BSMAP for methylation, see: https://sites.google.com/a/brown.edu/bioinformatics-in-biomed/bsmap-for-methylation

Dependencies: bsmap-2.90, bamUtil (https://github.com/statgen/bamUtil), bamtools/2.2.3 and java/1.8. The last two are loaded with:

module load bamtools/2.2.3
module load java/1.8

Note: bsmap is run with the parameters "-v 5 -r 0 -q 20 -A AGATCGGAAGAGCGGTTCAGCAGGAATGCCG".
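To make the bsmap parameters in Details concrete, the following sketch assembles an alignment command as a string. It is not the code that runa_bsmap generates: the helper bsmap_cmd is hypothetical, the use of -a, -b, -d and -o for the two read files, the reference and the output is an assumption about the bsmap command line, and the ".bam" output suffix is a guess.

```r
# Assemble a bsmap command line as a string (illustration only).
# -v, -r, -q and -A come from the Details section; -a, -b, -d and -o
# are assumed to be the read, reference and output flags.
bsmap_cmd <- function(fq1, fq2, ref.fa, out) {
  paste("bsmap",
        "-a", fq1, "-b", fq2,
        "-d", ref.fa,
        "-o", paste0(out, ".bam"),  # hypothetical output name
        "-v 5 -r 0 -q 20",
        "-A AGATCGGAAGAGCGGTTCAGCAGGAATGCCG")
}

cmd <- bsmap_cmd(
  fq1    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_1.fastq.gz",
  fq2    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_2.fastq.gz",
  ref.fa = "~/dbcenter/Ecoli/reference/Ecoli_k12_MG1655.fasta",
  out    = "$HOME/dbcenter/Ecoli/fastq/SRR2921970"
)
cat(cmd, "\n")
```

Building the command as a plain string keeps it easy to inspect before it is written into a shell script.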

Value

Returns a batch of shell scripts.

Examples

inputdf <- data.frame(fq1="$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_1.fastq.gz",
                      fq2="$HOME/dbcenter/Ecoli/fastq/SRR2921970.sra_2.fastq.gz",
                      out="$HOME/dbcenter/Ecoli/fastq/SRR2921970")

runa_bsmap(inputdf, ref.fa="~/dbcenter/Ecoli/reference/Ecoli_k12_MG1655.fasta",
picardpwd="$HOME/bin/picard-tools-2.1.1/picard.jar",
email=NULL, runinfo = c(FALSE, "bigmemh", 1))

yangjl/maizeR documentation built on May 4, 2019, 2:28 p.m.