classifier: Classifier
In irycisBioinfo/PATO: Pangenome Analysis Toolkit

classifier

R Documentation

Classifier

Description

classifier takes a list of genome files (nucleotide or protein) and identifies the most similar species for each file. It searches all reference and representative genomes from the NCBI RefSeq database with Mash and reports the best hit for each genome file in the input list.

Usage

classifier(file_list, n_cores, type = "nucl", max_dist = 0.06)

Arguments

`file_list`	Data frame with the full path to the genome files (gene or protein multi-FASTA) or a gff_list object.
`n_cores`	Number of cores to use.
`type`	Type of sequence: 'nucl' (nucleotides), 'prot' (amino acids), or 'wgs' for whole genome sequence (only with gff_list objects).
`max_dist`	Maximum distance to report (1 - Average Nucleotide Identity). Members of the same species usually lie within a distance of 0.05 of each other.
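
Examples

The following is a minimal sketch of how the arguments above might fit together, not an official example from the package. The directory, the file names, the `File` column name, and the `ani_to_dist` helper are all hypothetical. The sketch also assumes that a one-column data frame of full paths is an acceptable `file_list`. The call to `classifier` runs only if PATO is installed and the files exist, and the structure of the returned object is not described here, so it is simply printed.

```r
# Hypothetical location and names of the genome files; replace with real ones.
genome_dir <- "/path/to/genomes"
fasta_names <- c("sample_A.fasta", "sample_B.fasta")

# One-column data frame holding the full path to each file.
# The column name `File` is an assumption made for this sketch.
file_list <- data.frame(
  File = file.path(genome_dir, fasta_names),
  stringsAsFactors = FALSE
)

# max_dist is documented as 1 - ANI, so an identity cutoff of 95%
# corresponds to a distance of about 0.05.
# `ani_to_dist` is a hypothetical helper, not part of PATO.
ani_to_dist <- function(ani) 1 - ani
max_dist <- ani_to_dist(0.95)

# Call classifier() only when PATO is available and every file exists.
if (requireNamespace("PATO", quietly = TRUE) &&
    all(file.exists(file_list$File))) {
  hits <- PATO::classifier(file_list,
                           n_cores = 2,
                           type = "nucl",
                           max_dist = max_dist)
  print(hits)
}
```

Tightening `max_dist` below the default of 0.06 should restrict the reported hits to closer matches, while a larger value would also report more distant reference genomes.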