readTASSELGBSv2: Import Read Depth and Alignment from TASSEL GBS v2

readTASSELGBSv2    R Documentation

Import Read Depth and Alignment from TASSEL GBS v2

Description

This function reads TagTaxaDist and SAM files output by the TASSEL 5 GBS v2 pipeline and generates a RADdata object suitable for downstream genotype estimation. It eliminates the need to run the DiscoverySNPCallerPluginV2 or the ProductionSNPCallerPluginV2, since polyRAD operates on haplotypes rather than SNPs.

Usage

readTASSELGBSv2(tagtaxadistFile, samFile, min.ind.with.reads = 200, 
                min.ind.with.minor.allele = 10, possiblePloidies = list(2),
                taxaPloidy = 2L, contamRate = 0.001, chromosomes = NULL)

Arguments

tagtaxadistFile

File name or path to a tab-delimited text file of read depth generated by the GetTagTaxaDistFromDBPlugin in TASSEL.

samFile

File name or path to the corresponding SAM file containing alignment information for the same set of tags. This file is obtained by running the TagExportToFastqPlugin in TASSEL, followed by alignment using Bowtie2 or BWA.

min.ind.with.reads

Integer used for marker filtering. The minimum number of individuals that must have read depth above zero for a locus to be retained in the output.

min.ind.with.minor.allele

Integer used for marker filtering. The minimum number of individuals possessing reads for the minor allele for a locus to be retained in the output. This value is also passed to the min.ind.with.haplotype argument of MergeRareHaplotypes.

possiblePloidies

A list indicating inheritance modes that might be encountered in the dataset. See RADdata.

taxaPloidy

A single integer, or an integer vector with one value per taxon, indicating ploidy. See RADdata.

contamRate

A number indicating the expected sample cross-contamination rate. See RADdata.

chromosomes

A character vector of chromosome names, indicating chromosomes to be retained in the output. If NULL, all chromosomes are retained. This argument is intended for reading data one chromosome (or a few chromosomes) at a time in order to conserve computer memory.
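
To read data chromosome by chromosome, the chromosome names are needed first. The following is a minimal base R sketch of one way to list them from a SAM file. It assumes that the names accepted by chromosomes match the reference sequence names in the third column of the SAM file. The helper sam_chromosomes and the toy SAM records are hypothetical and are not part of polyRAD.

```r
# Write a tiny made-up SAM file so that the sketch is self-contained.
sam_lines <- c(
  "@HD\tVN:1.0\tSO:unsorted",
  "@SQ\tSN:Chr01\tLN:1000",
  "@SQ\tSN:Chr02\tLN:1000",
  "tagSeq=ACGT\t0\tChr01\t100\t42\t4M\t*\t0\t0\tACGT\t*",
  "tagSeq=TTGA\t16\tChr02\t250\t42\t4M\t*\t0\t0\tTCAA\t*",
  "tagSeq=GGCC\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\t*"
)
toy_sam <- tempfile(fileext = ".sam")
writeLines(sam_lines, toy_sam)

# Hypothetical helper: collect reference sequence names (SAM column 3),
# skipping header lines (which start with "@") and unaligned tags
# (reference name "*").
sam_chromosomes <- function(samFile) {
  lines <- readLines(samFile)
  lines <- lines[!startsWith(lines, "@")]
  rname <- vapply(strsplit(lines, "\t", fixed = TRUE),
                  function(x) x[3], character(1))
  sort(unique(rname[rname != "*"]))
}

chroms <- sam_chromosomes(toy_sam)

# With polyRAD installed and real input files, one chromosome could then
# be imported at a time (not run here):
# for (chr in chroms) {
#   rad_chr <- readTASSELGBSv2(tagtaxadistFile, samFile, chromosomes = chr)
#   # ... process and save the results for this chromosome ...
# }
```

For very large SAM files, reading all lines at once may itself use a lot of memory. In that case the names could be taken from the @SQ header lines instead, if the aligner wrote them.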

Value

A RADdata object containing read depth and alignment information from the two input files.

Note

Sequence tags must be identical in length to be assigned to the same locus by this function. This is to prevent errors with MergeRareHaplotypes.

Author(s)

Lindsay V. Clark

References

TASSEL GBSv2 pipeline: https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline

Bowtie2: https://bowtie-bio.sourceforge.net/bowtie2/index.shtml

BWA: https://bio-bwa.sourceforge.net/

See Also

Other data import functions: readStacks, readHMC, readTagDigger, VCF2RADdata, readDArTag

Examples

# get files for this example
samfile <- system.file("extdata", "exampleTASSEL_SAM.txt",
                       package = "polyRAD")
ttdfile <- system.file("extdata", "example_TagTaxaDist.txt",
                       package = "polyRAD")

# import data
myrad <- readTASSELGBSv2(ttdfile, samfile, min.ind.with.reads = 8,
                         min.ind.with.minor.allele = 2)
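
The two filtering thresholds passed above can be illustrated on a toy read-depth matrix. This is a simplified sketch for a single hypothetical biallelic locus, not the code that polyRAD runs internally. In particular, treating the tag with the lower total depth as the minor allele is an assumption made for illustration.

```r
# Toy read depth for one hypothetical locus:
# rows are individuals, columns are the two tags (alleles).
depth <- matrix(c(5, 0,
                  3, 2,
                  0, 0,
                  4, 1,
                  0, 6),
                ncol = 2, byrow = TRUE,
                dimnames = list(paste0("ind", 1:5), c("tagA", "tagB")))

min.ind.with.reads <- 4
min.ind.with.minor.allele <- 2

# Number of individuals with read depth above zero at the locus
n_with_reads <- sum(rowSums(depth) > 0)

# Assumption for this sketch: the minor allele is the tag with the
# lower total depth across individuals.
minor <- which.min(colSums(depth))
n_with_minor <- sum(depth[, minor] > 0)

# The locus is retained only if both thresholds are met
keep <- n_with_reads >= min.ind.with.reads &&
  n_with_minor >= min.ind.with.minor.allele
```

Here four individuals have reads and three carry the minor tag, so this locus would pass both thresholds as set in the sketch.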

lvclark/polyRAD documentation built on Jan. 15, 2024, 4:19 a.m.