writeStatesByFastaEntries: Write states to h5 file


View source: R/inference.R

Description

writeStatesByFastaEntries optionally removes layers from a pretrained model, computes the model's states for the sequences in a fasta file, and writes a separate output file for every fasta entry. Each h5 file also contains the nucleotide sequence and the positions of the targets corresponding to the states. Output files are named file_path + "Nr" + i + file_name + file_type, where i is the number of the fasta entry.
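The naming rule above can be read as a plain concatenation. The sketch below is only that literal reading in base R; `output_name` is a hypothetical helper, not part of the package, and the example values are made up. Whether the real function inserts separators or strips an extension from the file name is not stated on this page.

```r
# Literal reading of the rule: file_path + "Nr" + i + file_name + file_type.
# Assumption: the parts are joined without any separators.
output_name <- function(file_path, i, file_name, file_type) {
  paste0(file_path, "Nr", i, file_name, file_type)
}

# Hypothetical values, chosen only to make the pattern visible.
names_out <- output_name("out/", 1:3, "_states", ".h5")
print(names_out)
```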

Usage

writeStatesByFastaEntries(
  model.path,
  layer.depth = NULL,
  fasta.path,
  round_digits = 2,
  file_name = "states.h5",
  file_path,
  step = 1,
  vocabulary = c("a", "c", "g", "t"),
  batch.size = 256,
  padding = FALSE,
  file_type = "h5",
  model = NULL,
  mode = "lm"
)
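A call might look like the following sketch. The package name `altum` is assumed from the page footer, the function is assumed to be exported, and all paths are hypothetical placeholders. The call only runs if the package is installed and both input files exist.

```r
# Hypothetical paths; replace them with a real model and fasta file.
args <- list(
  model.path = "model.h5",
  fasta.path = "genome.fasta",
  file_path  = "states_out",
  file_name  = "states.h5",
  file_type  = "h5",
  step       = 1,
  batch.size = 256,
  mode       = "lm"
)

# Guard: only attempt the call when the package and the inputs are available.
can_run <- requireNamespace("altum", quietly = TRUE) &&
  file.exists(args$model.path) &&
  file.exists(args$fasta.path)

if (can_run) {
  do.call(altum::writeStatesByFastaEntries, args)
}
```

Arguments left out of the list keep the defaults shown in the usage above.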

Arguments

model.path

Path to a pretrained model.

layer.depth

Depth of the layer to evaluate. If NULL, the last layer is used.

fasta.path

Path to fasta file.

round_digits

Number of decimal places to which states are rounded.

file_name

File name under which to store the states. The function adds "Nr" + i before the name, where i is the entry number.

file_path

Path to the folder where output is written.

step

Frequency of sampling steps.

vocabulary

Vector of allowed characters. Characters outside the vocabulary are encoded as 0-vectors.

batch.size

Number of samples to evaluate at once. Does not change output, only relevant for speed and memory.

padding

Logical scalar. If TRUE, generate states for the first maxlen nucleotides by padding the beginning of the sequence with 0-vectors.

file_type

Either "h5" or "csv".

model

A keras model. If both model and model.path are non-NULL, model is used for inference.

mode

Either "lm" for language model or "label" for label classification.

seqStart

Insert a character at the beginning of the sequence from one file.

seqEnd

Insert a character at the end of the sequence from one file.

withinFile

Insert characters between fasta entries.
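To illustrate the vocabulary and padding arguments described above, here is a minimal base R sketch of one-hot encoding in which characters outside the vocabulary become 0-vectors, followed by zero padding at the start of the sequence. `one_hot`, `pad_start` and `n_pad` are hypothetical names used for illustration. This is not the package's implementation, and the number of padding rows the real function adds is not stated on this page.

```r
# One-hot encode a sequence: one row per character, one column per
# vocabulary entry. Characters outside the vocabulary stay all-zero rows.
one_hot <- function(sequence, vocabulary = c("a", "c", "g", "t")) {
  chars <- strsplit(sequence, "")[[1]]
  idx <- match(chars, vocabulary)  # NA for out-of-vocabulary characters
  m <- matrix(0, nrow = length(chars), ncol = length(vocabulary),
              dimnames = list(NULL, vocabulary))
  known <- !is.na(idx)
  m[cbind(which(known), idx[known])] <- 1
  m
}

# Prepend n_pad all-zero rows to an encoded sequence.
# Assumption: n_pad is chosen by the caller (for example from maxlen).
pad_start <- function(encoded, n_pad) {
  zeros <- matrix(0, nrow = n_pad, ncol = ncol(encoded),
                  dimnames = list(NULL, colnames(encoded)))
  rbind(zeros, encoded)
}

x <- one_hot("acgnt")       # "n" is not in the vocabulary
padded <- pad_start(x, 3)   # three leading 0-vectors

print(x)
print(dim(padded))
```

In this sketch the row for "n" is all zeros, so it cannot be told apart from a padding row. That follows directly from encoding both cases as 0-vectors.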


hiddengenome/altum documentation built on April 22, 2020, 9:33 p.m.