classifier: Classifier

View source: R/classifier.R

classifier    R Documentation

Classifier

Description

Classifier takes a list of genome files (nucleotide or protein) and identifies the most similar species for each file. Classifier uses all reference and representative genomes from the NCBI RefSeq database and searches, using mash, for the best hit for each genome file in the input list.

Usage

classifier(file_list, n_cores, type = "nucl", max_dist = 0.06)

Arguments

file_list

Data frame with the full paths to the genome files (gene or protein multi-FASTA), or a gff_list object.
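
A minimal sketch of building such a data frame from a directory of FASTA files, using only base R. The directory, the file names and the column name File are hypothetical; this page does not say which column name, if any, classifier expects.

```r
# Create a throwaway directory with two placeholder FASTA files
# (hypothetical names, used only to make the sketch runnable).
genome_dir <- file.path(tempdir(), "genomes_example")
dir.create(genome_dir, showWarnings = FALSE)
writeLines(c(">seq1", "ATGC"), file.path(genome_dir, "sample_a.fasta"))
writeLines(c(">seq1", "ATGC"), file.path(genome_dir, "sample_b.fasta"))

# Collect the full paths into a one-column data frame.
# The column name "File" is an assumption, not documented behaviour.
paths <- list.files(genome_dir, pattern = "\\.fasta$", full.names = TRUE)
file_list <- data.frame(File = normalizePath(paths), stringsAsFactors = FALSE)
```

The resulting file_list could then be passed as the first argument of classifier, provided the real files exist at those paths.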

n_cores

Number of cores to use.

type

Type of sequence: 'nucl' (nucleotides), 'prot' (amino acids) or 'wgs' (whole-genome sequence; only with gff_list objects).

max_dist

Maximum distance to report (1 - Average Nucleotide Identity). Members of the same species are usually within a distance of 0.05 of each other.
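
As an illustration of the relation between distance and Average Nucleotide Identity (ANI), the sketch below converts an ANI value to a distance and applies a max_dist cutoff to a made-up table of hits. The hits table, its column names and the use of <= for the comparison are assumptions for illustration only; they are not taken from the classifier source.

```r
# Distance as described above: 1 - ANI (ANI expressed as a fraction).
ani_to_dist <- function(ani) 1 - ani

max_dist <- 0.06  # default value shown in Usage

# Made-up best hits for three input genomes (illustrative values only).
hits <- data.frame(
  genome   = c("sample_a", "sample_b", "sample_c"),
  distance = c(0.012, 0.048, 0.110),
  stringsAsFactors = FALSE
)

# Keep only hits within the cutoff; whether classifier uses <= or <
# is an assumption here.
reported <- hits[hits$distance <= max_dist, ]
```

Under these assumptions, an ANI of 94% corresponds to the default cutoff, and sample_c, at a distance of 0.110, would not be reported.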

Value

Classifier returns a data.frame with the best hit for each input genome.


irycisBioinfo/PATO documentation built on Oct. 19, 2023, 3:07 p.m.