statesByFastaOneFile: Write states to h5 file

View source: R/inference.R

Description

Optionally removes layers from a pretrained model, calculates the states for a fasta file, and writes a separate states matrix for every fasta entry into a single .h5 file. The h5 file also contains the nucleotide sequences and the positions of the targets corresponding to the states.

To access the content of the h5 file:

h5_path <- "/path/to/file"
h5_file <- hdf5r::H5File$new(h5_path, mode = "r")
a <- h5_file[["states"]]
# shows header names
# names(a)
b <- a[["someHeaderName"]]
# shows state matrix
# b[,]
h5_file$close_all()

Usage

statesByFastaOneFile(
  model.path,
  layer.depth = NULL,
  fasta.path,
  round_digits = 2,
  h5.filename = "states.h5",
  step = 1,
  vocabulary = c("a", "c", "g", "t"),
  batch.size = 256,
  padding = FALSE,
  verbose = TRUE,
  model = NULL,
  mode = "lm"
)

Arguments

model.path

Path to a pretrained model.

layer.depth

Depth of layer to evaluate. If NULL, the last layer is used.

fasta.path

Path to fasta file.

round_digits

Number of decimal places to which the states are rounded.

h5.filename

Filename of h5 file to store states.

step

Frequency of sampling steps.

vocabulary

Vector of allowed characters. Characters outside the vocabulary are encoded as a 0-vector.

batch.size

Number of samples to evaluate at once. Does not change output, only relevant for speed and memory.

padding

Logical scalar. If TRUE, states are also generated for the first maxlen nucleotides by padding the beginning of the sequence with 0-vectors.

verbose

Whether to print model before and after removing layers.

model

A keras model. If both model and model.path are not NULL, model is used for inference.

mode

Either "lm" for language model or "label" for label classification.
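
Example

The vocabulary and padding arguments both refer to 0-vectors in the encoded input. The following base R sketch illustrates the idea, under stated assumptions. The helpers one_hot_encode and pad_start are hypothetical and not part of the package, and the number of padded rows (n_pad) is chosen freely here, since the exact amount of padding used internally is not documented above.

```r
# Hypothetical helpers for illustration only; not functions of the package.

# One-hot encode a sequence. Characters outside the vocabulary
# are encoded as 0-vectors (rows containing only zeros).
one_hot_encode <- function(sequence, vocabulary = c("a", "c", "g", "t")) {
  chars <- strsplit(sequence, "")[[1]]
  idx <- match(chars, vocabulary)  # NA for characters not in the vocabulary
  encoded <- matrix(0, nrow = length(chars), ncol = length(vocabulary),
                    dimnames = list(NULL, vocabulary))
  known <- which(!is.na(idx))
  encoded[cbind(known, idx[known])] <- 1
  encoded
}

# Prepend n_pad rows of 0-vectors to an encoded sequence.
# Assumption: n_pad is a free parameter in this sketch.
pad_start <- function(encoded, n_pad) {
  zeros <- matrix(0, nrow = n_pad, ncol = ncol(encoded),
                  dimnames = list(NULL, colnames(encoded)))
  rbind(zeros, encoded)
}

encoded <- one_hot_encode("acgnt")   # "n" is outside the default vocabulary
padded <- pad_start(encoded, n_pad = 3)
print(padded)
```

In this sketch the fourth row of encoded (the character "n") contains only zeros, and padded starts with three all-zero rows followed by the encoded sequence.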


hiddengenome/altum documentation built on April 22, 2020, 9:33 p.m.