docs/documentation/minIon_workflow.md

About

HaplotypR is a program for analysis of Amplicon-Seq genotyping experiments.

The HaplotypR project was developed by Anita Lerch. A paper with more details about the program is available from:

License

HaplotypR is distributed under the GNU General Public License, version 3.

Installation

To install HaplotypR, start R and first install ShortRead and dada2 by typing:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("ShortRead","dada2"))

Then install devtools by typing:

install.packages("devtools")

Finally, install HaplotypR by typing:

library(devtools)
devtools::install_github("lerch-a/HaplotypR")
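
Before moving on, it can help to confirm that the packages named above are actually available. The following is a minimal sketch using only base R; the variable names `required`, `installed` and `notInstalled` are illustrative and not part of HaplotypR.

```r
# Packages named in the installation steps above
required <- c("ShortRead", "dada2", "devtools", "HaplotypR")

# requireNamespace() returns TRUE if a package can be loaded and FALSE otherwise
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed
notInstalled <- required[!installed]
if (length(notInstalled) > 0) {
  message("Still missing: ", paste(notInstalled, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

If a package is reported as missing, repeating the corresponding installation step above is the first thing to try.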

Run HaplotypR on the R command line

library("HaplotypR")
library("ShortRead")

Create the output directory 'outputDir' and copy the example files to the current working directory:

# Define output directory
outputDir <- "exampleHaplotypR"
# Create output directory
if(!dir.exists(outputDir))
  dir.create(outputDir, recursive=TRUE)

# Copy example files to the current working directory
file.copy(from=system.file(package="HaplotypR", "extdata/ex3"), to=".", recursive=TRUE)

# List the example files in the copied directory 'ex3'
dir(file.path("ex3"))

The last R command should list the following files: "marker_file_ex3.txt", "sample_file_ex3.txt" and others.
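
The check described above can also be done programmatically. This is a minimal sketch in base R; `missingFiles` is a hypothetical helper written for this example and is not a HaplotypR function. Only the two file names mentioned in the text are checked, since the remaining contents of the example directory are not listed here.

```r
# Hypothetical helper: return the expected file names that are absent from 'dir'
missingFiles <- function(dir, expected) {
  expected[!file.exists(file.path(dir, expected))]
}

# File names taken from the text above; the example directory holds further files
expected <- c("marker_file_ex3.txt", "sample_file_ex3.txt")

absent <- missingFiles("ex3", expected)
if (length(absent) > 0) {
  warning("Missing example files: ", paste(absent, collapse = ", "))
}
```

A warning at this point usually means that the copy step was run from a different working directory than the current one.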

Run demultiplexing by sample and rename output files

# set input file paths
primerFile <- "ex3/marker_file_ex3.txt"
sampleFile <- "ex3/sample_file_ex3.txt"
readsDir <- "ex3/read_dir_ex3"

# create output subdirectory
outDeplexSample <- file.path(outputDir, "dePlexSample")
dir.create(outDeplexSample, recursive=TRUE)

# rename sample files and merge to a single file per sample if needed
sampleTab <- read.delim(sampleFile, stringsAsFactors=FALSE)
dePlexSample <- mergeMinIONfiles(inDir=readsDir, outDir=outDeplexSample, sampleTab=sampleTab)

# save summary table
write.table(dePlexSample, file.path(outputDir, "demultiplexSampleSummary.txt"), sep="\t", row.names=FALSE)

Run demultiplexing by marker and truncate primer sequences

# create output subdirectory
outDeplexMarker <- file.path(outputDir, "dePlexMarker")
dir.create(outDeplexMarker)

# process each marker
markerTab <- read.delim(primerFile, stringsAsFactors=FALSE)
# keep at most the last 21 bases of each primer so that primers have a comparable length for demultiplexing
markerTab$Forward <- substr(markerTab$Forward, start=nchar(markerTab$Forward)-20, stop=nchar(markerTab$Forward))
markerTab$Reverse <- substr(markerTab$Reverse, start=nchar(markerTab$Reverse)-20, stop=nchar(markerTab$Reverse))
dePlexMarker <- demultiplexByMarkerMinION(dePlexSample, markerTab, outDeplexMarker, max.mismatch=2)

# save summary table
write.table(dePlexMarker, file.path(outputDir, "demultiplexMarkerSummary.txt"), sep="\t", row.names=FALSE)
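
To see what the primer truncation step does, the same `substr()` expression can be applied to toy input. The sketch below uses made-up primer sequences (they are not taken from the example marker file) and only base R.

```r
# Made-up primer sequences for illustration only
primers <- c(long  = "ACGTACGTACGTACGTACGTACGTACGTAC",  # 30 bases
             short = "ACGTACGTACGTACGT")                # 16 bases

# Same expression as in the workflow: characters n-20 to n of each primer
trimmed <- substr(primers, start = nchar(primers) - 20, stop = nchar(primers))

# Length of each primer after truncation: 21 for 'long', 16 for 'short'
nchar(trimmed)
```

Primers longer than 21 bases are cut down to their last 21 bases, while shorter primers are returned unchanged because `substr()` treats a start position below 1 as 1. Primers therefore end up with the same length only if each of them has at least 21 bases.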

Call Haplotypes

# options for haplotype calling
minCov <- 3
detectionLimit <- 1/100
minOccHap <- 1
minCovSample <- 100

# call final haplotypes
finalTab <- createFinalHaplotypTableDADA2(
  outputDir = outputDir, sampleTable = dePlexMarker, markerTable = markerTab,
  referenceSequence = NULL, filterIndel = TRUE,
  minHaplotypCoverage = minCov, minReplicate = minOccHap,
  detectability = detectionLimit, minSampleCoverage = minCovSample,
  multithread = FALSE, pool = "pseudo", OMEGA_A = 1e-120)

write.csv(finalTab, file=file.path(outputDir, "finalHaplotypList_vMinION.csv"), row.names=FALSE)
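
The sketch below illustrates how thresholds like the ones above could interact. It rests on an assumption: a haplotype is kept when the sample has at least `minCovSample` reads, the haplotype has at least `minCov` reads, and its within-sample frequency is at least `detectionLimit`. This reading is inferred from the option names and is not a confirmed description of how `createFinalHaplotypTableDADA2` applies them. The read counts are made up.

```r
# Option values from the workflow above
minCov <- 3
detectionLimit <- 1/100
minCovSample <- 100

# Made-up read counts per haplotype in a single sample (illustration only)
reads <- c(hapA = 900, hapB = 92, hapC = 5, hapD = 3)
total <- sum(reads)    # 1000 reads in this sample
freq <- reads / total  # within-sample frequency of each haplotype

# ASSUMED rule, not taken from the HaplotypR source code
sampleOk <- total >= minCovSample
keep <- sampleOk & reads >= minCov & freq >= detectionLimit
keep
```

Under this assumed rule, hapC and hapD pass the read-count threshold but fall below the 1% frequency limit, so only hapA and hapB are kept. With deep coverage the frequency limit is the stricter filter; with shallow coverage the minimum read count is.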


lerch-a/HaplotypR documentation built on Dec. 22, 2024, 2:19 p.m.