fastaLabelGenerator
Iterates over a folder containing .fasta files and produces one-hot encodings of the predictor sequences
and target variables. Targets are read from the fasta headers.
fastaLabelGenerator(
  corpus.dir,
  format = "fasta",
  batch.size = 256,
  maxlen = 250,
  max_iter = 10000,
  vocabulary = c("a", "c", "g", "t"),
  verbose = FALSE,
  randomFiles = FALSE,
  step = 1,
  showWarnings = FALSE,
  seed = 1234,
  shuffleFastaEntries = FALSE,
  numberOfFiles = NULL,
  fileLog = NULL,
  labelVocabulary = c("x", "y", "z"),
  reverseComplements = TRUE
)
| corpus.dir | Input directory where .fasta files are located, or path to a single file ending with .fasta or .fastq (as specified in the format argument). | 
| format | File format, either fasta or fastq. | 
| batch.size | Number of samples in one batch. | 
| maxlen | Length of predictor sequence. | 
| max_iter | Stop after max_iter iterations have failed to produce a new batch. | 
| vocabulary | Vector of allowed characters; characters outside the vocabulary are encoded as a 0-vector. | 
| verbose | Logical, whether to show messages. | 
| randomFiles | Logical, whether to go through files randomly or sequentially. | 
| step | How often to take a sample, i.e. the distance in characters between the start positions of consecutive samples. | 
| showWarnings | Logical, whether to give a warning if a character outside the vocabulary appears. | 
| seed | Sets seed for set.seed function, for reproducible results when using  | 
| shuffleFastaEntries | Logical, whether to shuffle the fasta entries. | 
| numberOfFiles | Use only the specified number of files; ignored if greater than the number of files in corpus.dir. | 
| fileLog | Write file names to a csv file if a path is specified. | 
| labelVocabulary | Character vector of possible targets. Targets outside  | 
| reverseComplements | Logical; if TRUE, half of each batch contains sequences and the other half their reverse complements. The reverse complement is obtained by reversing the order of the sequence and swapping A with T and C with G. | 
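The reverse complement described for reverseComplements can be sketched in base R as follows. This is only an illustration of the transformation, not the package's internal code; the helper name `reverse_complement` is hypothetical.

```r
# Hypothetical helper, not part of the package: reverse complement of a DNA string.
reverse_complement <- function(seq) {
  # chartr() replaces each character of the first string with the character at
  # the same position in the second: a <-> t and c <-> g, in lower and upper case.
  complemented <- chartr("acgtACGT", "tgcaTGCA", seq)
  # Split into single characters, reverse their order, and join them again.
  paste(rev(strsplit(complemented, "", fixed = TRUE)[[1]]), collapse = "")
}

reverse_complement("aacg")  # "cgtt"
```

Characters that are not one of a, c, g, t (for example n) are left unchanged by this sketch and only change position.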
A list of length 2. The first element is a 3-dimensional tensor with dimensions (batch.size, maxlen, length(vocabulary)), encoding the predictor sequences. The second element is a matrix with dimensions (batch.size, length(labelVocabulary)), encoding the targets.
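To make the shapes of the returned list concrete, the following base R sketch builds a toy batch by hand. It does not call fastaLabelGenerator; the helper `one_hot_sequence`, the toy sequences, and the toy labels are all hypothetical, and it assumes the targets are one-hot encoded over labelVocabulary.

```r
# Hypothetical helper: one-hot encode one sequence into a matrix with one row
# per position and one column per vocabulary character.
one_hot_sequence <- function(seq, vocabulary = c("a", "c", "g", "t")) {
  chars <- strsplit(seq, "", fixed = TRUE)[[1]]
  idx <- match(chars, vocabulary)  # NA for characters outside the vocabulary
  mat <- matrix(0, nrow = length(chars), ncol = length(vocabulary))
  known <- !is.na(idx)
  # Set a 1 at (position, vocabulary index); unknown characters stay all-zero.
  mat[cbind(which(known), idx[known])] <- 1
  mat
}

vocabulary <- c("a", "c", "g", "t")
labelVocabulary <- c("x", "y", "z")
seqs <- c("acgt", "ttna")  # toy batch: batch.size = 2, maxlen = 4
labels <- c("y", "x")      # toy targets, as if read from the fasta headers

# Predictor tensor with dimensions (batch.size, maxlen, length(vocabulary)).
x <- array(0, dim = c(length(seqs), 4, length(vocabulary)))
for (i in seq_along(seqs)) x[i, , ] <- one_hot_sequence(seqs[i], vocabulary)

# Target matrix with dimensions (batch.size, length(labelVocabulary)).
y <- matrix(0, nrow = length(labels), ncol = length(labelVocabulary))
y[cbind(seq_along(labels), match(labels, labelVocabulary))] <- 1

batch <- list(x, y)
dim(batch[[1]])  # 2 4 4
dim(batch[[2]])  # 2 3
```

The character n in the second toy sequence lies outside the vocabulary, so its row in the tensor is a 0-vector.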