run_tsv2bam: Run STACKS tsv2bam and merge BAM files
In thierrygosselin/stackr: Run stacks pipeline for RADseq analysis inside R

run_tsv2bam

R Documentation

Run STACKS tsv2bam and merge BAM files

Description

Runs the STACKS tsv2bam module. In addition, this function generates a summary of the tsv2bam output and merges the per-sample BAM files, in parallel, into a single catalog BAM file using SAMtools or Sambamba. tsv2bam converts the data (single-end or paired-end) from being organized by sample to being organized by locus, which allows downstream improvements (e.g. Bayesian SNP calling).

Usage

run_tsv2bam(
  P = "06_ustacks_2_gstacks",
  M = "02_project_info/population.map.tsv2bam.tsv",
  R = NULL,
  parallel.core = parallel::detectCores() - 1,
  cmd.path = "/usr/local/bin/samtools",
  h = FALSE
)

Arguments

`P`	(path, character) Path to the directory containing STACKS files. Default: `P = "06_ustacks_2_gstacks"`. The folder should contain the catalog files, which start with `batch_` and end with `.alleles.tsv.gz`, `.snps.tsv.gz` and `.tags.tsv.gz`, and three files for each sample, named with the sample name as prefix and ending with `.alleles.tsv.gz`, `.snps.tsv.gz` and `.tags.tsv.gz`. These files are created by the ustacks, sstacks and cxstacks modules.
`M`	(character, path) Path to a population map file. Note that the `-s` option is not used inside stackr. Default: `M = "02_project_info/population.map.tsv2bam.tsv"`.
`R`	(path, character) Directory containing the paired-end read files (in fastq/fasta/bam (gz) format). Default: `R = NULL`.
`parallel.core`	(integer) Number of threads to use for parallel execution. Default: `parallel.core = parallel::detectCores() - 1`.
`cmd.path`	(character, path) The FULL path to the SAMtools program. See Details on how to install SAMtools. Default: `cmd.path = "/usr/local/bin/samtools"`.
`h`	(logical) Display the help message. Default: `h = FALSE`.
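The file layout described for `P` and `M` can be checked before launching the run. The sketch below is only an illustration: `check_tsv2bam_inputs()` is a hypothetical helper, not a stackr function, and it assumes the population map is tab-separated, has no header and lists sample names in its first column. It does not check the `batch_` catalog files.

```r
# Hypothetical helper, not part of stackr: checks that the folder passed as `P`
# holds the three per-sample files for every sample listed in the population map.
check_tsv2bam_inputs <- function(P, M) {
  suffixes <- c(".alleles.tsv.gz", ".snps.tsv.gz", ".tags.tsv.gz")

  # Assumption: tab-separated, no header, sample names in the first column.
  pop.map <- utils::read.delim(
    M, header = FALSE, stringsAsFactors = FALSE, colClasses = "character"
  )
  samples <- pop.map[[1]]

  # Every sample/suffix combination that should exist inside P.
  expected <- as.vector(outer(samples, suffixes, paste0))
  missing.files <- expected[!file.exists(file.path(P, expected))]

  list(n.samples = length(samples), missing.files = missing.files)
}

# Toy demonstration in temporary locations (files are empty placeholders).
P <- tempfile("06_ustacks_2_gstacks_")
dir.create(P)
M <- tempfile("population.map.tsv2bam_", fileext = ".tsv")
writeLines(c("sample_01\tpop_a", "sample_02\tpop_b"), M)

# sample_01 gets all three files, sample_02 lacks its tags file.
all.suffixes <- c(".alleles.tsv.gz", ".snps.tsv.gz", ".tags.tsv.gz")
invisible(file.create(file.path(P, paste0("sample_01", all.suffixes))))
invisible(file.create(file.path(P, paste0("sample_02", all.suffixes[1:2]))))

res <- check_tsv2bam_inputs(P = P, M = M)
res$missing.files
```

An empty `missing.files` vector suggests the per-sample inputs are in place; anything listed there is worth resolving before calling `run_tsv2bam()`.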

Details

SAMtools must be installed before running this function; see the detailed SAMtools installation instructions.

Value

tsv2bam returns a set of .matches.bam files.

The function run_tsv2bam returns a list with the number of individuals, the batch ID number, a summary data frame and a plot, with the following variables:

INDIVIDUALS: the sample ID
ALL_LOCUS: the total number of loci for the individual (shown in subplot A)
LOCUS: the number of loci with a one-to-one relationship with the catalog (shown in subplot B)
MATCH_PERCENT: the percentage of loci with a one-to-one relationship with the catalog (shown in subplot C)
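To make the relationship between the summary columns concrete, here is a minimal sketch. It assumes that MATCH_PERCENT is LOCUS expressed as a percentage of ALL_LOCUS, which the descriptions above imply but do not state as a formula. The counts are made up and the object names are illustrative, not stackr output.

```r
# Made-up counts for three hypothetical samples; not real stackr output.
tsv2bam.summary <- data.frame(
  INDIVIDUALS = c("sample_01", "sample_02", "sample_03"),
  ALL_LOCUS = c(1000L, 800L, 500L),
  LOCUS = c(900L, 600L, 450L),
  stringsAsFactors = FALSE
)

# Assumption: MATCH_PERCENT is LOCUS expressed as a percentage of ALL_LOCUS.
tsv2bam.summary$MATCH_PERCENT <- round(
  100 * tsv2bam.summary$LOCUS / tsv2bam.summary$ALL_LOCUS, 2
)

# Flag samples below an arbitrary, purely illustrative threshold of 80 percent.
low.match <- tsv2bam.summary$INDIVIDUALS[tsv2bam.summary$MATCH_PERCENT < 80]

print(tsv2bam.summary)
```

A filter like `low.match` is one way to spot samples that match the catalog poorly; the threshold here carries no recommendation.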

Additionally, the function returns a catalog.bam file, generated by merging all the individual BAM files in parallel.
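For readers curious what such a merge looks like at the command line, the sketch below assembles a `samtools merge` call from R without executing it. `build_merge_args()` is a hypothetical helper and the sample file names are invented; this is not stackr's internal code, and how stackr distributes the merge across cores is not shown here.

```r
# Hypothetical helper, not stackr's internal code: assembles the arguments for
# `samtools merge` without running anything.
build_merge_args <- function(bam.files, out.bam = "catalog.bam", threads = 1L) {
  stopifnot(length(bam.files) >= 1L, threads >= 1L)
  # `-@` sets the number of additional threads; the output file precedes inputs.
  c("merge", "-@", as.character(threads), out.bam, bam.files)
}

# Invented file names following the .matches.bam pattern described above.
bam.files <- c("sample_01.matches.bam", "sample_02.matches.bam")
merge.args <- build_merge_args(bam.files, threads = 4L)

# Print the equivalent shell command.
cat(paste(c("samtools", merge.args), collapse = " "), "\n")

# To actually run it, assuming SAMtools is installed at the default `cmd.path`
# and the BAM files exist:
# system2("/usr/local/bin/samtools", args = merge.args)
```

Building the argument vector separately from the `system2()` call keeps the command easy to inspect before anything touches the BAM files.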

References

Catchen JM, Amores A, Hohenlohe PA et al. (2011) Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences. G3, 1, 171-182.

Catchen JM, Hohenlohe PA, Bassham S, Amores A, Cresko WA (2013) Stacks: an analysis tool set for population genomics. Molecular Ecology, 22, 3124-3140.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-2079.

Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics, 27, 2987-2993.

Tarasov A, Vilella AJ, Cuppen E, Nijman IJ, Prins P (2015) Sambamba: fast processing of NGS alignment formats. Bioinformatics.

Examples

## Not run: 
# The simplest form of the function:
bam.sum <- stackr::run_tsv2bam() # that's it !

## End(Not run)

thierrygosselin/stackr documentation built on April 13, 2025, 10:28 a.m.
