run_GATK_JointGenotype: 'Run GATK Joint Genotype on farm'


View source: R/run_GATK_JointGenotype.R

Description

Generates shell scripts that run GATK joint genotyping on farm, following the GATK Best Practices recommended workflows for variant discovery analysis.

Usage

run_GATK_JointGenotype(gvcf, outvcf,
  ref.fa = "~/dbcenter/Ecoli/reference/Ecoli_k12_MG1655.fasta",
  gatkpwd = "$HOME/bin/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar",
  includeNonVariantSites = FALSE, hardfilter = TRUE,
  snpflt = "\"QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0\"",
  indelflt = "\"QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0\"",
  email = NULL, runinfo = c(FALSE, "bigmemh", 1))

set_hardfilter(outvcf, gatkpwd, snpflt, indelflt, ref.fa, runinfo, shid)

Arguments

gvcf

A character vector of g.vcf file names.

outvcf

File name of the output VCF file, e.g. "mysamples.vcf".

ref.fa

The full path to the reference genome FASTA file, indexed with bwa.

gatkpwd

The absolute path of GenomeAnalysisTK.jar.

includeNonVariantSites

Include loci found to be non-variant after genotyping (for GenotypeGVCFs).

hardfilter

Whether to apply hard filters to the variants. For details on applying hard filters to a call set, see https://www.broadinstitute.org/gatk/guide/article?id=2806

snpflt

Filter expression applied to the SNP call set.

indelflt

Filter expression applied to the indel call set.

email

Email address that farm will notify when the job finishes or fails.

runinfo

Parameters specifying the array job partition information, given as a vector such as c(FALSE, "bigmemh", "1"): 1) whether to run the job, default=FALSE; 2) -p partition name, default=bigmemh; and 3) --cpus, default=1. The vector is passed to set_array_job.
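
Because R vectors are atomic, c(FALSE, "bigmemh", 1) is stored as a character vector, so all three fields arrive as strings. The following is a minimal sketch of that coercion; the variable names run_now, partition and cpus are hypothetical, and how farmeR itself unpacks the fields is not shown on this page.

```r
# Mixed types in c() are coerced to the most general type, here character.
runinfo <- c(FALSE, "bigmemh", 1)
print(runinfo)

# Illustrative (hypothetical) unpacking of the three fields.
run_now   <- as.logical(runinfo[1])  # 1) run or not
partition <- runinfo[2]              # 2) -p partition name
cpus      <- as.integer(runinfo[3])  # 3) --cpus
```

Converting the fields back explicitly avoids surprises such as comparing the string "FALSE" with a logical value.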

Details

For more detail about GATK, see https://www.broadinstitute.org/gatk/guide/bp_step.php?p=1

Local programs: bwa (version 0.7.5a-r405), picard-tools-2.1.1, GenomeAnalysisTK-3.5.

Value

Returns a batch of shell scripts.

Examples

library(farmeR)

# Input g.vcf files and the name of the joint-genotyped output file
gvcf <- c("1.vcf", "2.vcf")
outvcf <- "out.vcf"

run_GATK_JointGenotype(
  gvcf,
  outvcf = outvcf,
  ref.fa = "~/dbcenter/Ecoli/reference/Ecoli_k12_MG1655.fasta",
  gatkpwd = "$HOME/bin/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar",
  includeNonVariantSites = FALSE,
  hardfilter = TRUE,
  # Filter expressions keep their own escaped double quotes
  snpflt = "\"QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0\"",
  indelflt = "\"QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0\"",
  email = NULL,
  # run or not, partition name, number of cpus
  runinfo = c(FALSE, "bigmemh", 1)
)
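
The snpflt and indelflt values carry escaped double quotes, so the quotes are part of the string itself. As a rough sketch using only base R (the name snp_conditions is hypothetical), the SNP expression can be assembled from its individual conditions and printed to check the quoting:

```r
# Individual hard-filter conditions, copied from the default snpflt.
snp_conditions <- c("QD < 2.0", "FS > 60.0", "MQ < 40.0",
                    "MQRankSum < -12.5", "ReadPosRankSum < -8.0")

# Join with the logical OR operator and wrap in literal double quotes.
snpflt <- paste0("\"", paste(snp_conditions, collapse = " || "), "\"")

# cat() prints the raw string, including the surrounding quotes.
cat(snpflt, "\n")
```

Building the string this way makes it easier to change a single threshold without disturbing the quoting.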

yangjl/farmeR documentation built on May 4, 2019, 2:28 p.m.