make_raw_nt: Make matrix of raw bytes from nucleotide sequences

View source: R/primers.R

make_raw_nt    R Documentation

Make matrix of raw bytes from nucleotide sequences

Description

Each sequence in the given vector becomes a column of the output matrix, with one row per position. Shorter sequences are padded with a fixed value at the bottom of the matrix. With the defaults (see RAW_NT), A, C, G, and T (case-insensitive) are encoded as 01, 02, 03, and 04, IUPAC codes are bitwise combinations of those values, padding is 0x80, and any other character is 0x00.
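The layout described above can be illustrated with a small base R sketch. This is not the package's implementation: `toy_map` and `sketch_raw_nt` are hypothetical names, and the toy map covers only the four plain bases (no IUPAC codes), using the byte values quoted in the description.

```r
# Hypothetical stand-in for the behavior described above; NOT the package's code.
# toy_map is an assumption: it holds only A, C, G, T with the values from the text.
toy_map <- as.raw(c(1, 2, 3, 4))
names(toy_map) <- c("A", "C", "G", "T")

sketch_raw_nt <- function(seqs, map = toy_map, pad = 128, other = 0) {
  seqs <- toupper(seqs)  # treat input as case-insensitive
  # One row per position of the longest sequence, pre-filled with the pad byte
  out <- matrix(as.raw(pad), nrow = max(nchar(seqs), 0), ncol = length(seqs))
  for (i in seq_along(seqs)) {
    chars <- strsplit(seqs[i], "")[[1]]
    idx <- match(chars, names(map))         # NA where a character is not in map
    vals <- rep(as.raw(other), length(chars))
    vals[!is.na(idx)] <- map[idx[!is.na(idx)]]
    out[seq_along(vals), i] <- vals         # rows past the sequence end stay padded
  }
  out
}

m <- sketch_raw_nt(c("ACGT", "ac", "AXG"))
# column 1 ("ACGT"): 01 02 03 04
# column 2 ("ac"):   01 02 80 80  (lowercase accepted, two pad bytes)
# column 3 ("AXG"):  01 00 03 80  (X is not in the map, one pad byte)
```

Filling the matrix with the pad byte first means only the occupied positions of each column need to be written.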

Usage

make_raw_nt(seqs, map = RAW_NT, pad = 128, other = 0, chunksize = 8000)

Arguments

seqs

character vector of nucleotide sequences

map

named raw vector mapping nucleotide characters to byte values

pad

raw value to use for missing positions

other

raw value to use for nucleotides not in map

chunksize

integer number of sequences to process at a time. With very many input sequences (hundreds of thousands or more) this function can use a lot of memory; processing only this many sequences at once limits memory usage.
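As a rough illustration of what chunked processing means, the sketch below groups sequence indices into consecutive chunks in base R. How the function splits its input internally is not shown in this page, so the grouping here is an assumption, and the variable names are hypothetical.

```r
# Assumed illustration of chunking; the function's actual internals may differ.
seqs <- c("ACGT", "AC", "GGA", "T", "CA")
chunksize <- 2

# Assign each sequence index to a chunk of at most `chunksize` sequences
chunks <- split(seq_along(seqs), ceiling(seq_along(seqs) / chunksize))

# Each chunk would be handled on its own, so any temporary data scales with
# chunksize rather than with the total number of sequences
chunk_sizes <- lengths(chunks)
```

Every index appears in exactly one chunk, so the output would still hold one column per input sequence whatever chunk size is used.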

Value

raw matrix with positions on rows and sequences on columns


ShawHahnLab/microsat documentation built on Aug. 25, 2023, 11:16 p.m.