BayesANT: Main function to call the BayesANT (BAYESiAn Nonparametric...

BayesANT    R Documentation

Main function to call the BayesANT (BAYESiAn Nonparametric Taxonomic) classifier. The function constructs an object of class 'BayesANT', which can be used for predictions.

Description

Main function to call the BayesANT (BAYESiAn Nonparametric Taxonomic) classifier. The function constructs an object of class 'BayesANT', which can be used for predictions.

Usage

BayesANT(
  data,
  typeseq = "aligned",
  type_location = "single",
  kmers = 5,
  newtaxa = TRUE,
  usegap = FALSE,
  save_nucl = FALSE,
  verbose = TRUE
)

Arguments

data

An object of class c('data.frame', 'BayesANT.data') containing the training library. It must be loaded with the function read.BayesANT.data.

typeseq

Format of the sequences used to train the classifier. Options are 'not aligned' and 'aligned'. Default is typeseq = 'aligned'. When typeseq = 'aligned', the function returns an error if the sequences are not all of the same length.
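
The equal-length requirement for typeseq = 'aligned' can be checked before training. The following is a minimal base-R sketch; the helper name all_same_length and the toy sequences are illustrative and are not part of BayesANT.

```r
# Hypothetical helper (not part of BayesANT): TRUE when every sequence
# has the same number of characters, as typeseq = 'aligned' requires.
all_same_length <- function(seqs) {
  length(unique(nchar(seqs))) <= 1
}

aligned_seqs   <- c("ACGT-A", "ACGTTA", "AC-TTA")  # toy data, all length 6
unaligned_seqs <- c("ACGTA", "ACGTTA", "ACTT")     # toy data, lengths 5, 6, 4

all_same_length(aligned_seqs)    # TRUE
all_same_length(unaligned_seqs)  # FALSE
```

A library that fails this check would presumably need to be aligned first or trained with typeseq = 'not aligned'.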

type_location

How to model the loci in an aligned set of DNA sequences. Used only if typeseq = 'aligned'. Options are 'single', which refers to the single-location multinomial kernel, and 'pairs', which is the 2-mer multinomial kernel. Default is type_location = 'single'.

kmers

Length of a substring under a k-mer decomposition. Used only if typeseq = 'not aligned'. Default is kmers = 5, which is also the recommended choice. The maximum value allowed is kmers = 8.
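
To make the k-mer decomposition concrete, the sketch below counts the substrings of length k in one sequence using base R. It assumes overlapping windows that advance one position at a time; how BayesANT builds its decomposition internally is not described on this page, and the helper name count_kmers is hypothetical.

```r
# Hypothetical helper (not part of BayesANT): count the overlapping
# substrings of length k in a single DNA sequence.
count_kmers <- function(sequence, k) {
  n <- nchar(sequence)
  if (n < k) return(table(character(0)))  # no k-mer fits in the sequence
  starts <- seq_len(n - k + 1)
  table(substring(sequence, starts, starts + k - 1))
}

counts <- count_kmers("ACGTACGT", k = 3)  # toy sequence
as.integer(counts["ACG"])  # 2: "ACG" starts at positions 1 and 5

# Number of distinct k-mers over the alphabet A, C, G, T
4^5  # 1024
4^8  # 65536
```

The number of possible k-mers grows as 4^k, so larger values of kmers produce much larger count vectors per sequence.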

newtaxa

Whether to account for new taxa when constructing the classifier. Default is newtaxa = TRUE. If newtaxa = FALSE, no potential unobserved branches are included in the taxonomy.

usegap

Whether to include the alignment gap "-" among the nucleotides in the aligned multinomial kernel. Default is usegap = FALSE.

save_nucl

Whether to save the nucleotide counts and the hyperparameters of the model in a list. Default is save_nucl = FALSE. With save_nucl = TRUE the returned object can become large, so use it only if strictly needed.

verbose

Whether to report the steps performed while training the classifier. Default is verbose = TRUE.

Value

An object of class 'BayesANT'. It is a list containing the following elements:

  • data: Dataset used for training.

  • data_missing: Dataset containing sequences with missing annotations.

  • missing_taxa: Indices of the sequences with missing values.

  • typeseq: Type of sequences used to train the classifier.

  • type_location: Type of location used to train the classifier.

  • newtaxa: Whether new taxa are included in the classification.

  • level_names: Names of the taxonomic ranks.

  • nucl: Nucleotides detected.

  • kmers: k-mer length selected to build the classifier.

  • ParameterMatrix: Matrix that stores the model parameters.

  • Nucl_counts: List of counts of the nucleotides at every leaf.

  • PYpars: Estimated Pitman-Yor parameters.

  • Priorprobs: Prior probabilities selected for the model.

  • hyperparameters: List of model hyperparameters.

  • leaves: Names of the taxonomic leaves.

  • sequences_length: Length of each sequence in the data.
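
A call that follows the Usage and Value sections might look like the sketch below. The wrapper train_aligned is hypothetical; it only forwards the documented arguments to BayesANT and skips training when the package is not installed. The input library_data is assumed to have been loaded with read.BayesANT.data, whose arguments are not documented on this page.

```r
# Hypothetical wrapper (not part of BayesANT): train on an aligned library
# using the documented defaults, or do nothing if the package is missing.
train_aligned <- function(library_data) {
  if (!requireNamespace("BayesANT", quietly = TRUE)) {
    message("BayesANT is not installed; skipping training.")
    return(invisible(NULL))
  }
  BayesANT::BayesANT(
    data = library_data,       # assumed output of read.BayesANT.data
    typeseq = "aligned",
    type_location = "single",
    newtaxa = TRUE,
    usegap = FALSE,
    save_nucl = FALSE,
    verbose = TRUE
  )
}

# Intended use, left commented because it needs a real training library:
# classifier <- train_aligned(library_data)
# class(classifier)        # 'BayesANT', per the Value section
# classifier$typeseq       # type of sequences used for training
# classifier$level_names   # names of the taxonomic ranks
```

Because the returned object is a list, its elements can be read with the usual `$` accessor.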


alessandrozito/BayesANT documentation built on April 5, 2025, 6:22 a.m.