Tetramers: A base container class for tetramer analysis

Tetramers R Documentation

A base container class for tetramer analysis

Description

R6 base class that holds the inputs, parameters, and results of a tetramer analysis

Create a new Tetramers container object

Read a FASTA file

Tabulate tetramers, compute PCA, and select outliers

Retrieve the normalized tabulations as a matrix or table

Retrieve the principal components as a matrix or table

Retrieve the principal component rotational loadings as a matrix or table

Retrieve outlier sequences

Load BLAST results

Run BLAST on the outlier FASTA

Run BLAST on the outlier FASTA or simply load existing BLAST output

Run BLAST on the outlier FASTA

Usage

Tetramers

Arguments

x

character, filename for FASTA format sequences

name

character, by default extracted from filename

output_dir

the name of the output directory, if not provided then it is automatically a subdirectory of the input file's directory

parameters

numeric, named vector of selection parameters

  • window: window size in bases, default = 1600

  • window step: the distance in bases to advance the window, default = 200

  • width: the k-mer width in bases; tetramers are 4 bases wide, default = 4

  • hsp_bit_score_min: hits with a bit score below this are flagged with noHitText, default = 75
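To illustrate how the window size, window step, and width interact, here is a minimal base R sketch. It is not the package's implementation: the helper names window_starts and count_kmers are hypothetical, the step is simply called step here, and the choice to drop sequences shorter than one window is an assumption.

```r
# Hypothetical helper: start positions of sliding windows along a sequence
# of length n. Sequences shorter than one window yield no windows.
window_starts <- function(n, window = 1600, step = 200) {
  if (n < window) return(integer(0))
  seq.int(from = 1L, to = n - window + 1L, by = step)
}

# Hypothetical helper: count overlapping k-mers (k = width) in a string.
count_kmers <- function(s, width = 4) {
  n <- nchar(s)
  if (n < width) return(table(character(0)))
  starts <- seq_len(n - width + 1L)
  table(substring(s, starts, starts + width - 1L))
}

# A 2000-base sequence with the documented defaults gives three windows
starts <- window_starts(2000, window = 1600, step = 200)  # 1 201 401

# "ACGTACGT" holds five overlapping tetramers; "ACGT" occurs twice
counts <- count_kmers("ACGTACGT", width = 4)
```

With these defaults a sequence shorter than 1600 bases produces no windows at all, which is one plausible reason a contig could be reported as too short.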

pick

numeric, named vector keyed by PC id giving the number of outliers to pick at each extreme. By default the 2 most extreme windows at each end (2 positive, 2 negative) for PC1 through PC8.
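The selection described for pick can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the helper pick_extremes is hypothetical, and the element names PC1 to PC8 in the default vector are a guess at how the PC ids are spelled.

```r
# Assumed shape of the default: 2 picks per extreme for PC1 through PC8
pick <- setNames(rep(2, 8), paste0("PC", 1:8))

# Hypothetical helper: indices of the n most negative and n most positive
# scores along one principal component.
pick_extremes <- function(scores, n = 2) {
  ord <- order(scores)  # ascending, so the most negative come first
  list(neg = head(ord, n), pos = rev(tail(ord, n)))
}

# Toy PC1 scores for six windows
scores <- c(w1 = 0.3, w2 = -2.1, w3 = 1.7, w4 = -0.4, w5 = 2.5, w6 = -1.2)
picked <- pick_extremes(scores, n = pick[["PC1"]])

neg_names <- names(scores)[picked$neg]  # "w2" "w6"
pos_names <- names(scores)[picked$pos]  # "w5" "w3"
```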

select_method

character, by default "tidy". Ignored.

blast_options

character, the blast options

...

further arguments for read_fastafile

verbose

logical, if TRUE output some chatter

form

character, either 'matrix' or 'table' (default)

npc

numeric, the first npc components are returned

filename

character, xml filename

verbose

logical, if TRUE output messages for debugging purposes

Format

An object of class R6ClassGenerator of length 24.

Value

matrix or table of normalized tetramer counts

matrix or table of principal components

matrix or table of principal component rotational loadings

a table of [cname, wname, PC, pick, sequence] which is an augmented version of the outliers table

logical, TRUE if successful

Fields

filename

character, the fully qualified FASTA filename

name

character, parsed from the filename by default but settable

out_dir

character, the fully qualified output path, possibly created

parameters

named numeric vector, composed of

  • window: window size in bases, default = 1600

  • window step: the distance in bases to advance the window, default = 200

  • width: the k-mer width in bases; tetramers are 4 bases wide, default = 4

  • hsp_bit_score_min: hits with a bit score below this are flagged with noHitText, default = 75
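The hsp_bit_score_min threshold can be pictured with a small base R sketch. The hit names, the scores, and the placeholder string used for noHitText are all invented for illustration; the actual value of noHitText is not stated here.

```r
# Toy bit scores for three hypothetical BLAST hits
bit_scores <- c(hitA = 120, hitB = 60, hitC = 75)

hsp_bit_score_min <- 75
no_hit_text <- "no hit"  # placeholder, the real noHitText may differ

# Hits strictly below the minimum are replaced by the placeholder text
labels <- ifelse(bit_scores < hsp_bit_score_min, no_hit_text, names(bit_scores))
```

A score equal to the minimum is kept in this sketch, following the wording "hits below this".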

pick

numeric, named vector of the number of outliers to pick per PC, generally 2.

blastOpts

list, a named list of blast options

blast

any, blast results

blastinfo

any, info on the blast application

rawdata

any, raw counts

data

matrix,

fail

list, list of failed contigs (generally too short)

PC

list, prcomp results, of which the "x" element (the scores) is the one of interest
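For readers less familiar with prcomp, the following base R sketch shows where the scores ("x") and the rotational loadings live in its result, and how the first npc components can be taken. The toy matrix is random data standing in for a windows-by-tetramers matrix, and the data frame is only an approximation of the 'table' form.

```r
set.seed(1)

# Toy stand-in for a normalized count matrix: 10 windows by 5 features
m <- matrix(rnorm(50), nrow = 10, ncol = 5)

pc <- prcomp(m)

npc <- 3
# Scores: one row per window, one column per component
scores <- pc$x[, seq_len(npc), drop = FALSE]
# Loadings: one row per feature, one column per component
loadings <- pc$rotation[, seq_len(npc), drop = FALSE]

# 'matrix' form is the object as is; a 'table' form is approximated here
# with a base data.frame
scores_tbl <- as.data.frame(scores)
```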

outliers

tibble, selected outliers per PC


BigelowLab/tetramers documentation built on April 3, 2022, 8:22 p.m.