report_haplotypes: Analyse AmpSeq experiments and generate HTML reports


View source: R/report_haplotypes.R

Description

report_haplotypes analyses an AmpSeq experiment using the HaplotypR package. A summary of the results is presented in an HTML report with downloadable data embedded.

Usage

report_haplotypes(run_name, sample_table, marker_table, barcodes_fwd,
  barcodes_rev, reads_fwd, reads_rev, marker_panel = NULL,
  novel_hap_name = "novel", out_dir = NULL, overwrite = FALSE,
  snp_min_coverage = 100, snp_min_mismatch_rate = 0.5,
  snp_min_obs = 2, hap_sample_min_coverage = 300,
  hap_min_coverage = 3, hap_min_freq_detection = 1/100,
  parallel = "auto", max_parallel = 4, min_merge_overlap = 20,
  trim_fwd = "auto", trim_rev = "auto", trim_target_qual = 35,
  trim_target_qual_percent = 75, trim_target_len_percent = 95,
  read_join_strategy = c("auto", "bind", "merge"),
  nread_split_target = 5e+05, nread_gather_target = 50000,
  max_proc_reads = 5000, enable_forking = TRUE,
  browse_report = FALSE, marker_read_target = 1000)

Arguments

run_name

A character scalar. Name of the AmpSeq run, included in report title.

sample_table

A data.frame with sample information. Must contain columns SampleID, BarcodeID_F, BarcodeID_R, SampleName.

marker_table

A data.frame with marker information. Must contain columns MarkerID, Forward, Reverse, ReferenceSequence.
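The required columns of sample_table and marker_table can be checked before starting a run. The following is a minimal sketch in base R; the helper check_required_columns and all values in the two tables are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of HaplotypReportR): return the names of
# required columns that are absent from a data.frame.
check_required_columns <- function(tbl, required) {
  setdiff(required, colnames(tbl))
}

# Invented sample sheet with the four columns the documentation asks for.
sample_table <- data.frame(
  SampleID    = c("S1", "S2"),
  BarcodeID_F = c("F1", "F2"),
  BarcodeID_R = c("R1", "R1"),
  SampleName  = c("sample_A", "sample_B"),
  stringsAsFactors = FALSE
)

# Invented marker table that deliberately lacks ReferenceSequence.
marker_table <- data.frame(
  MarkerID = c("mkA", "mkB"),
  Forward  = c("ACGTACGT", "TTGACCAA"),
  Reverse  = c("GGCATTCA", "CCATGGTA"),
  stringsAsFactors = FALSE
)

missing_sample <- check_required_columns(
  sample_table, c("SampleID", "BarcodeID_F", "BarcodeID_R", "SampleName")
)
missing_marker <- check_required_columns(
  marker_table, c("MarkerID", "Forward", "Reverse", "ReferenceSequence")
)
```

Here missing_sample is empty, while missing_marker names the one column that still has to be added.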

barcodes_fwd

Path to fasta file containing forward barcodes. Sequence names must match sample_table$BarcodeID_F.

barcodes_rev

Path to fasta file containing reverse barcodes. Sequence names must match sample_table$BarcodeID_R.
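Because the barcode sequence names must match the IDs in sample_table, it can help to compare them up front. This sketch reads FASTA headers with base R only; fasta_names is a hypothetical helper, and the barcode file and sample sheet are invented for illustration.

```r
# Hypothetical helper: extract sequence names from a FASTA file.
fasta_names <- function(path) {
  lines <- readLines(path)
  headers <- lines[startsWith(lines, ">")]
  # Drop the leading ">" and anything after the first whitespace.
  sub("\\s.*$", "", sub("^>", "", headers))
}

# Toy forward-barcode file written to a temporary location.
fwd_path <- tempfile(fileext = ".fasta")
writeLines(c(">F1", "ACGTAC", ">F2", "TGCATG"), fwd_path)

# Toy sample sheet that refers to a barcode (F3) absent from the file.
sample_table <- data.frame(
  BarcodeID_F = c("F1", "F2", "F3"),
  stringsAsFactors = FALSE
)

# IDs used in the sample sheet that have no sequence in the FASTA file.
unmatched_fwd <- setdiff(sample_table$BarcodeID_F, fasta_names(fwd_path))
```

The same comparison applies to barcodes_rev and sample_table$BarcodeID_R.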

reads_fwd

Path to fastq(.gz) file containing forward reads.

reads_rev

Path to fastq(.gz) file containing reverse reads.

marker_panel

A data.frame with a panel of marker sequences (optional). Must contain columns MarkerID, Haplotype, Sequence.
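A marker panel with the three required columns might look like the sketch below. All IDs and sequences are invented, and the cross-check against marker IDs reflects an assumption that panel entries should refer to markers defined in marker_table; the documentation does not state this requirement.

```r
# Invented marker IDs standing in for marker_table$MarkerID.
marker_ids <- c("mkA", "mkB")

# Invented panel: two known haplotypes for mkA and one for an undefined marker.
marker_panel <- data.frame(
  MarkerID  = c("mkA", "mkA", "mkC"),
  Haplotype = c("mkA-1", "mkA-2", "mkC-1"),
  Sequence  = c("ACGTTGCA", "ACGATGCA", "TTGGCCAA"),
  stringsAsFactors = FALSE
)

# Panel markers with no matching entry among the marker IDs (assumed check).
unknown_markers <- setdiff(unique(marker_panel$MarkerID), marker_ids)
```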

out_dir

Path to the directory to output results.

overwrite

A logical scalar. Indicates whether output directory should be overwritten if it already exists.

snp_min_coverage

An integer scalar. Minimum read coverage required to identify a SNP in a sample.

snp_min_mismatch_rate

A numeric scalar. Minimum fraction of reads supporting the alternate allele required to identify a SNP in a sample. Must be between 0 and 1.
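To make the two thresholds concrete, the sketch below applies snp_min_coverage and snp_min_mismatch_rate to invented per-position counts. It only illustrates the arithmetic; the use of inclusive comparisons (>=) is an assumption, and the actual SNP calling is done inside HaplotypR and may differ.

```r
snp_min_coverage <- 100
snp_min_mismatch_rate <- 0.5

# Invented read counts at three positions in one sample.
positions <- data.frame(
  position   = c(10, 25, 40),
  coverage   = c(250, 80, 400),
  mismatches = c(140, 60, 90)
)

# Fraction of reads that disagree with the reference at each position.
positions$mismatch_rate <- positions$mismatches / positions$coverage

# Assumed rule: both thresholds must be met for a position to count as a SNP.
positions$is_snp <- positions$coverage >= snp_min_coverage &
  positions$mismatch_rate >= snp_min_mismatch_rate
```

Under these assumptions only position 10 qualifies: position 25 has too little coverage, and position 40 has a mismatch rate below 0.5.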

snp_min_obs

An integer scalar. Minimum number of times a SNP must be observed to be included in Haplotypes.

hap_sample_min_coverage

An integer scalar. Minimum sample marker coverage for calling Haplotypes. Passed to HaplotypR::createFinalHaplotypTable as minSampleCoverage.

hap_min_coverage

An integer scalar. Minimum haplotype coverage for calling Haplotypes. Passed to HaplotypR::createFinalHaplotypTable as minHaplotypCoverage.

hap_min_freq_detection

A numeric scalar. Minimum haplotype frequency in a sample to be reported. Must be between 0 and 1. Passed to HaplotypR::createFinalHaplotypTable as detectability.
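The three haplotype thresholds can be pictured as successive filters. The following sketch uses invented read counts for one sample and marker; the way the filters are combined here is an assumption for illustration only, since the real filtering is performed by HaplotypR::createFinalHaplotypTable.

```r
hap_sample_min_coverage <- 300
hap_min_coverage <- 3
hap_min_freq_detection <- 1 / 100

# Invented read counts per haplotype for one sample and marker.
hap_reads <- c(hapA = 900, hapB = 95, hapC = 5, hapD = 2)

sample_coverage <- sum(hap_reads)
hap_freq <- hap_reads / sample_coverage

# Assumed logic: skip the sample if its total coverage is too low, otherwise
# keep haplotypes that pass both the read-count and the frequency threshold.
if (sample_coverage >= hap_sample_min_coverage) {
  keep <- hap_reads >= hap_min_coverage & hap_freq >= hap_min_freq_detection
  reported <- names(hap_reads)[keep]
} else {
  reported <- character(0)
}
```

In this toy case hapC has enough reads but falls below the 1% frequency threshold, and hapD has too few reads, so only hapA and hapB would be reported.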

parallel

An integer scalar or 'auto'. Number of processes to run in parallel using the future and furrr packages. When set to 'auto' the number of cores available will be detected.

max_parallel

An integer scalar. Maximum number of processes to run in parallel.
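One plausible way parallel and max_parallel interact is sketched below. The helper resolve_workers is hypothetical, and both the use of parallel::detectCores() and the capping of explicit values by max_parallel are assumptions; the package itself relies on future and furrr and may resolve the worker count differently.

```r
# Hypothetical helper: turn the requested setting into a worker count.
resolve_workers <- function(requested = "auto", max_parallel = 4,
                            detected = parallel::detectCores()) {
  n <- if (identical(requested, "auto")) detected else as.integer(requested)
  # detectCores() can return NA on some platforms; fall back to one worker.
  if (is.na(n)) n <- 1L
  # Never exceed max_parallel and never go below one worker.
  max(1L, min(n, as.integer(max_parallel)))
}

workers_auto     <- resolve_workers("auto", max_parallel = 4, detected = 16)
workers_explicit <- resolve_workers(2, max_parallel = 4, detected = 16)
workers_unknown  <- resolve_workers("auto", max_parallel = 4, detected = NA)
```

Passing detected explicitly, as above, keeps the sketch reproducible on any machine.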

min_merge_overlap

An integer scalar. Minimum number of overlapping base pairs to attempt paired read merging with HaplotypR::mergeAmpliconReads.

trim_fwd

An integer scalar or 'auto'. Number of base pairs to trim forward reads to when using HaplotypR::bindAmpliconReads. When set to 'auto' this will be chosen based on read quality profiles.

trim_rev

An integer scalar or 'auto'. See trim_fwd.

trim_target_qual

An integer scalar. Parameters controlling automatic read trimming.

trim_target_qual_percent

An integer scalar. Parameters controlling automatic read trimming.

trim_target_len_percent

An integer scalar. Parameters controlling automatic read trimming.

read_join_strategy

A character scalar. When set to 'auto' reads will be 'merged' if there is sufficient overlap, otherwise they will be 'bound'. When set to 'bind' reads are joined with HaplotypR::bindAmpliconReads. When set to 'merge' reads are joined with HaplotypR::mergeAmpliconReads.
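The default c("auto", "bind", "merge") in the usage follows the common R idiom of resolving a choice vector with match.arg, so that omitting the argument selects the first value. The sketch below shows that idiom together with an assumed 'auto' decision; choose_join is hypothetical, the overlap is taken as a given input (how the package estimates it is not documented here), and the inclusive comparison against min_merge_overlap is an assumption.

```r
# Hypothetical helper: pick the joining strategy for a given read overlap.
choose_join <- function(strategy = c("auto", "bind", "merge"),
                        overlap, min_merge_overlap = 20) {
  # match.arg() returns the first choice when the argument is omitted and
  # raises an error for values outside the allowed set.
  strategy <- match.arg(strategy)
  if (strategy == "auto") {
    # Assumed rule: merge when the reads overlap enough, otherwise bind.
    strategy <- if (overlap >= min_merge_overlap) "merge" else "bind"
  }
  strategy
}

join_long_overlap  <- choose_join(overlap = 35)
join_short_overlap <- choose_join(overlap = 5)
join_forced_bind   <- choose_join("bind", overlap = 35)
```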

nread_split_target

An integer scalar. Parameters controlling splitting input reads for parallelism.

nread_gather_target

An integer scalar. Parameters controlling splitting input reads for parallelism.

max_proc_reads

An integer scalar. Maximum number of reads per sample to use for genotyping. Lower values will improve runtime, higher values will improve sensitivity.

enable_forking

A logical scalar. Enables parallelism by forking (faster but less stable).

browse_report

A logical scalar. Display report in browser after run is complete.

marker_read_target

An integer scalar. Target number of reads per sample marker, only affects plot in report.

Examples

## Not run: 
example_data <- get_haplotypr_example_data()

report_haplotypes(run_name = 'Example',
                  sample_table = example_data$sample_table,
                  marker_table = example_data$marker_table,
                  barcodes_fwd = example_data$barcodes_fwd,
                  barcodes_rev = example_data$barcodes_rev,
                  reads_fwd = example_data$reads_fwd,
                  reads_rev = example_data$reads_rev,
                  read_join_strategy = 'bind')

## End(Not run)

bahlolab/HaplotypReportR documentation built on Dec. 2, 2019, 7:36 p.m.