An introduction to Rbowtie

Introduction

The r Biocpkg("Rbowtie") package provides an R wrapper around the popular bowtie [@bowtie] short read aligner and around SpliceMap [@SpliceMap] a de novo splice junction discovery and alignment tool, which makes use of the bowtie software package.

The package is used by the r Biocpkg("QuasR") [@QuasR] bioconductor package to _qu_antify and _a_nnotate _s_hort _r_eads. We recommend to use the r Biocpkg("QuasR") package instead of using r Biocpkg("Rbowtie") directly. The r Biocpkg("QuasR") package provides a simpler interface than r Biocpkg("Rbowtie") and covers the whole analysis workflow of typical ultra-high throughput sequencing experiments, starting from the raw sequence reads, over pre-processing and alignment, up to quantification.

Preliminaries

Citing Rbowtie

If you use r Biocpkg("Rbowtie") [@Rbowtie] in your work, you can cite it as follows:

citation("Rbowtie")

Installation

r Biocpkg("Rbowtie") is a package for the R computing environment and it is assumed that you have already installed R. See the R project at (http://www.r-project.org). To install the latest version of r Biocpkg("Rbowtie"), you will need to be using the latest version of R. r Biocpkg("Rbowtie") is part of the Bioconductor project at (http://www.bioconductor.org). To get r Biocpkg("Rbowtie") together with its dependencies you can use

if (!require("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("Rbowtie")

Loading of Rbowtie

In order to run the code examples in this vignette, the r Biocpkg("Rbowtie") library need to be loaded.

library(Rbowtie)

How to get help

Most questions about r Biocpkg("Rbowtie") will hopefully be answered by the documentation or references. If you've run into a question which isn't addressed by the documentation, or you've found a conflict between the documentation and software itself, then there is an active support community which can offer help.

The authors of the package (maintainer: r maintainer("Rbowtie")) always appreciate receiving reports of bugs in the package functions or in the documentation. The same goes for well-considered suggestions for improvements.

Any other questions or problems concerning r Biocpkg("Rbowtie") should be posted to the Bioconductor support site (https://support.bioconductor.org). Users posting to the support site for the first time should read the helpful posting guide at (https://support.bioconductor.org/info/faq/). Note that each function in r Biocpkg("Rbowtie") has it's own help page, e.g. help("bowtie"). Posting etiquette requires that you read the relevant help page carefully before posting a problem to the site.

Example usage for individual Rbowtie functions

Please refer to the r Biocpkg("Rbowtie") reference manual or the function documentation (e.g. using ?bowtie) for a complete description of r Biocpkg("Rbowtie") functions. The descriptions provided below are meant to give and overview over all functions and summarize the purpose of each one.

Build the reference index with bowtie_build{#bowtieBuild}

To be able to align short reads to a genome, an index has to be build first using the function bowtie_build. Information about arguments can be found with the help of the bowtie_build_usage function or in the manual page ?bowtie_build.

bowtie_build_usage()

refFiles below is a vector with filenames of the reference sequence in FASTA format, and indexDir specifies an output directory for the index files that will be generated when calling bowtie_build:

refFiles <- dir(system.file(package="Rbowtie", "samples", "refs"), full=TRUE)
indexDir <- file.path(tempdir(), "refsIndex")

tmp <- bowtie_build(references=refFiles, outdir=indexDir, prefix="index", force=TRUE)
head(tmp)

Create alignment with bowtie

Information about the arguments supported by the bowtie function can be obtained with the help of the bowtie_usage function or in the manual page ?bowtie.

bowtie_usage()

In the example below, readsFiles is the name of a file containing short reads to be aligned with bowtie, and samFiles specifies the name of the output file with the generated alignments.

readsFiles <- system.file(package="Rbowtie", "samples", "reads", "reads.fastq")
samFiles <- file.path(tempdir(), "alignments.sam")

bowtie(sequences=readsFiles, 
       index=file.path(indexDir, "index"), 
       outfile=samFiles, sam=TRUE,
       best=TRUE, force=TRUE)
strtrim(readLines(samFiles), 65)

Create spliced alignment with SpliceMap

While bowtie only generates ungapped alignments, the SpliceMap function can be used to generate spliced alignments. SpliceMap is itself using bowtie. To use it, it is necessary to create an index of the reference sequence as described in \@ref(bowtieBuild). SpliceMap parameters are specified in the form of a named list, which follows closely the configure file format of the original SpliceMap program[@SpliceMap]. Be aware that SpliceMap can only be used for reads that are at least 50bp long.

readsFiles <- system.file(package="Rbowtie", "samples", "reads", "reads.fastq")
refDir <- system.file(package="Rbowtie", "samples", "refs", "chr1.fa")
indexDir <- file.path(tempdir(), "refsIndex")
samFiles <- file.path(tempdir(), "splicedAlignments.sam")

cfg <- list(genome_dir=refDir,
            reads_list1=readsFiles,
            read_format="FASTQ",
            quality_format="phred-33",
            outfile=samFiles,
            temp_path=tempdir(),
            max_intron=400000,
            min_intron=20000,
            max_multi_hit=10,
            seed_mismatch=1,
            read_mismatch=2,
            num_chromosome_together=2,
            bowtie_base_dir=file.path(indexDir, "index"),
            num_threads=4,
            try_hard="yes",
            selectSingleHit=TRUE)
res <- SpliceMap(cfg)
res
strtrim(readLines(samFiles), 65)

Session information

The output in this vignette was produced under:

sessionInfo()

References



Try the Rbowtie package in your browser

Any scripts or data that you put into this service are public.

Rbowtie documentation built on Nov. 8, 2020, 6:11 p.m.