TileSeqs: Form a Set of Tiles for Each Group of Sequences.
In DECIPHER: Tools for curating, analyzing, and manipulating biological sequences

Description Usage Arguments Details Value Note Author(s) See Also Examples

Creates a set of tiles that represent each group of sequences in the database for downstream applications.

TileSeqs(dbFile,
         tblName = "Seqs",
         identifier = "",
         minLength = 26,
         maxLength = 27,
         maxTilePermutations = 10,
         minCoverage = 0.9,
         add2tbl = FALSE,
         processors = 1,
         verbose = TRUE,
         ...)

`dbFile`	A SQLite connection object or a character string specifying the path to the database file.
`tblName`	Character string specifying the table of sequences to use for forming tiles.
`identifier`	Optional character string used to narrow the search results to those matching a specific identifier. If "" then all identifiers are selected.
`minLength`	Integer providing the minimum number of nucleotides in each tile. Typically the same or slightly less than `maxLength`.
`maxLength`	Integer providing the maximum number of nucleotides in each tile. Tiles are designed primarily for this length, which should ideally be slightly greater than the maximum length of oligos used in downstream functions.
`maxTilePermutations`	Integer specifying the maximum number of tiles in each target site.
`minCoverage`	Numeric providing the fraction of coverage that is desired for each target site in the group. For example, a `minCoverage` of 0.9 requests that additional tiles are added until 90% of the group is represented by the tile permutations.
`add2tbl`	Logical or a character string specifying the table name in which to add the result.
`processors`	The number of processors to use, or `NULL` to automatically detect and use all available processors.
`verbose`	Logical indicating whether to display progress.
`...`	Additional arguments to be passed directly to `SearchDB`.
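
The interplay between `minCoverage` and `maxTilePermutations` can be illustrated with a minimal sketch in base R. It is not DECIPHER's internal code: `select_permutations` and the toy `site` vector are hypothetical, and the sketch assumes that permutations are simply ranked by how often they occur at one target site.

```r
# Hypothetical helper, not part of DECIPHER: pick tile permutations for a
# single target site, most common first, until the coverage goal is met.
select_permutations <- function(tiles, minCoverage = 0.9,
                                maxTilePermutations = 10) {
  # count each distinct tile and rank from most to least common
  counts <- sort(table(tiles), decreasing = TRUE)
  # fraction of the group represented after adding each permutation
  covered <- cumsum(as.vector(counts)) / length(tiles)
  # first rank at which the desired coverage is reached, if any
  reached <- which(covered >= minCoverage)
  n <- if (length(reached) > 0) reached[1] else length(counts)
  # never keep more permutations than allowed
  n <- min(n, maxTilePermutations)
  data.frame(tile = names(counts)[seq_len(n)],
             cumulativeCoverage = covered[seq_len(n)],
             stringsAsFactors = FALSE)
}

# toy target site observed in a group of 10 sequences (made-up data)
site <- c(rep("ACGTACGT", 6), rep("ACGTACGA", 3), "ACGTTCGA")
sel <- select_permutations(site, minCoverage = 0.9)
print(sel)
```

In this toy group the two most common permutations together represent 9 of 10 sequences, so the third permutation is not needed to reach a `minCoverage` of 0.9.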

TileSeqs will create a set of overlapping tiles representing each target site in an alignment of sequences. The most common tile permutations are added until the desired minimum group coverage is obtained. The dbFile is assumed to contain DNAStringSet sequences (any U's are converted to T's).

Target sites with one or more tiles not meeting a set of requirements are marked with misprime equal to TRUE. Requirements include minimum group coverage, minimum length, and maximum length. Additionally, tiles are required not to contain more than four runs of a single base or four di-nucleotide repeats.
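
The repeat requirement can be approximated with regular expressions. The sketch below is only one reading of the rule and is not taken from DECIPHER's source: it assumes that a tile fails when a single base, or a di-nucleotide, occurs more than four times in a row. The function `has_long_repeat` and the example tiles are hypothetical.

```r
# Hypothetical check, not DECIPHER's internal implementation.
has_long_repeat <- function(tile, maxRepeats = 4) {
  # one base followed by at least maxRepeats copies of itself
  run <- sprintf("([ACGT])\\1{%d,}", maxRepeats)
  # one di-nucleotide followed by at least maxRepeats copies of itself
  di <- sprintf("([ACGT]{2})\\1{%d,}", maxRepeats)
  grepl(run, tile, perl = TRUE) | grepl(di, tile, perl = TRUE)
}

# made-up tiles: a run of five A's, five AC repeats, and no long repeat
tiles <- c("ACGTACGTACGTAAAAACGTACGTAC",
           "ACACACACACGTTGCATGCATGCATG",
           "ACGTTGCATGGATCCATGCAAGCTTG")
flags <- has_long_repeat(tiles)
print(flags)
```

Under this assumption the first two tiles would be flagged and the third would pass.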

A data.frame with a row for each tile, and multiple columns of information. The row_names column gives the row number. The start, end, start_aligned, and end_aligned columns provide positioning of the tile in a consensus sequence formed from the group. The column misprime is a logical that is TRUE when the tile's target site does not meet the specified constraints. The columns width and id indicate the tile's length and group of origin, respectively.

The coverage field gives the fraction of sequences containing the tile in the group that encompass the tile's start and end positions in the alignment, whereas groupCoverage contains the fraction of all sequences in the group containing a tile at their respective target site. For example, if only a single sequence out of 10 has information (no gap) in the first alignment position, then coverage would be 100% (1.0), while groupCoverage would be 10% (0.1).
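
The arithmetic in this example can be reproduced with a few lines of base R. The alignment column below is made-up data, and the variable names are chosen only for illustration.

```r
# First alignment column of a toy group of 10 sequences; "-" marks a gap.
first_column <- c("A", rep("-", 9))
tile <- "A"

has_info <- first_column != "-"   # sequences with information at the site
has_tile <- first_column == tile  # sequences containing the tile

# fraction of sequences spanning the site that contain the tile: 1 / 1
coverage <- sum(has_tile) / sum(has_info)
# fraction of all sequences in the group that contain the tile: 1 / 10
groupCoverage <- sum(has_tile) / length(first_column)

print(c(coverage = coverage, groupCoverage = groupCoverage))
```

The two values differ only in their denominator: sequences that span the site versus all sequences in the group.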

The final column, target_site, provides the sequence of the tile.

If add2tbl is TRUE then the tiles will be added to the database table that currently contains the sequences used for tiling. The added tiles may cause interference when querying a table of sequences. Therefore, it is recommended to add the tiles to their own table, for example, by using add2tbl="Tiles".

Erik Wright eswright@pitt.edu

DesignPrimers

db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
tiles <- TileSeqs(db, identifier="Pseudomonadales")

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%

Time difference of 3.45 secs

DECIPHER documentation built on Nov. 8, 2020, 8:30 p.m.
