plasmidome R Documentation

Plasmidome: Extract the plasmid sequences from a set of genomes

Description

This function uses mlplasmids to find the contigs that belong to plasmids. The mlplasmids package must be installed on your computer (see Details). Currently, mlplasmids only processes Enterococcus faecium, Escherichia coli or Klebsiella pneumoniae genomes.

Usage

plasmidome(gff_list, specie)

Arguments

gff_list

A gff_list object (see load_gff_list())

specie

The species of the input genomes. mlplasmids accepts Enterococcus faecium, Escherichia coli or Klebsiella pneumoniae.
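Because only three species are accepted, it can help to validate the value before starting a long run. The following is a minimal sketch: check_specie() is a hypothetical helper that is not part of PATO, and the exact spelling and capitalization that plasmidome() expects is assumed to match the names listed above.

```r
# Species names listed in this help page as supported by mlplasmids.
supported_species <- c("Enterococcus faecium",
                       "Escherichia coli",
                       "Klebsiella pneumoniae")

# Hypothetical helper (not part of PATO): stop early on an unsupported name.
check_specie <- function(specie) {
  if (!is.character(specie) || length(specie) != 1L ||
      !specie %in% supported_species) {
    stop("specie must be one of: ",
         paste(supported_species, collapse = ", "))
  }
  invisible(specie)
}

check_specie("Escherichia coli")   # passes silently
```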

Details

The input genomes must be in gff_list format (see the load_gff_list() function). plasmidome() finds the contigs that belong to plasmids and creates a new GFF structure containing all the feature information (FAA, FNA, FFN and GFF).
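The workflow described above might look like the sketch below. It assumes that load_gff_list() accepts a character vector of GFF file paths and that the species is passed as a plain string; the "gffs/" directory is a hypothetical location. The block does nothing unless PATO and mlplasmids are installed and at least one GFF file is found.

```r
# Sketch only: the "gffs/" directory is a hypothetical location for the
# annotated genomes. dir() returns an empty vector if it does not exist.
gff_files <- dir("gffs/", pattern = "\\.gff$", full.names = TRUE)

if (requireNamespace("PATO", quietly = TRUE) &&
    requireNamespace("mlplasmids", quietly = TRUE) &&
    length(gff_files) > 0) {
  library(PATO)
  # Assumption: load_gff_list() takes a vector of GFF file paths.
  gffs <- load_gff_list(gff_files)
  # Keep only the contigs that mlplasmids predicts as plasmid-derived.
  plasmids <- plasmidome(gffs, specie = "Escherichia coli")
}
```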

By default, PATO does not install mlplasmids because it is hosted in its own GitLab repository (https://gitlab.com/sirarredondo/mlplasmids).

The authors define mlplasmids as:

mlplasmids consists of binary classifiers to predict contigs either as plasmid-derived or chromosome-derived. It currently classifies short-read contigs as chromosomes and plasmids for Enterococcus faecium, Escherichia coli and Klebsiella pneumoniae. Further species will be added in the future. The provided classifiers are Support Vector Machines which were previously optimized and selected versus other existing machine-learning techniques (Logistic regression, Random Forest..., see also Reference). The classifiers use pentamer frequencies (n = 1,024) to infer whether a particular contig is plasmid- or chromosome- derived.
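To make the feature count concrete: there are 4^5 = 1,024 possible pentamers over the alphabet A, C, G, T. The base R sketch below computes the relative frequency of every pentamer in a single sequence. It is an illustration only and is not how mlplasmids itself computes its features; pentamer_frequencies() is a hypothetical name.

```r
# Count overlapping 5-mers (pentamers) in a DNA string and return their
# relative frequencies over all 4^5 = 1,024 possible pentamers.
pentamer_frequencies <- function(dna) {
  bases <- c("A", "C", "G", "T")
  # Enumerate every possible pentamer.
  grid <- expand.grid(rep(list(bases), 5), stringsAsFactors = FALSE)
  all_pentamers <- do.call(paste0, grid)

  dna <- toupper(dna)
  n <- nchar(dna)
  if (n < 5) stop("sequence shorter than 5 bases")
  starts <- seq_len(n - 4)
  observed <- substring(dna, starts, starts + 4)

  # Tabulate against the full pentamer set; windows containing other
  # symbols (for example N) become NA and are dropped by table().
  counts <- table(factor(observed, levels = all_pentamers))
  freqs <- as.numeric(counts) / sum(counts)
  names(freqs) <- all_pentamers
  freqs
}

f <- pentamer_frequencies("ACGTACGTAC")
length(f)   # 1024
```

Each contig is thus described by a fixed-length numeric vector regardless of its length, which is the kind of input a Support Vector Machine requires.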

To install mlplasmids you must have devtools installed (you usually do, because PATO itself requires devtools). Then type:

devtools::install_git("https://gitlab.com/sirarredondo/mlplasmids")
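In scripts, the installation can be skipped when the package is already present. The sketch below wraps the command above in a check; install_if_missing() is a hypothetical helper and is not part of PATO or devtools.

```r
# Install a package from a git URL only when it is not already available.
# install_if_missing() is a hypothetical helper, not part of PATO or devtools.
install_if_missing <- function(pkg, git_url) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))          # already installed, nothing to do
  }
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")      # devtools provides install_git()
  }
  devtools::install_git(git_url)
  invisible(TRUE)
}
```

It would be called as install_if_missing("mlplasmids", "https://gitlab.com/sirarredondo/mlplasmids") and returns FALSE invisibly when no installation was needed.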

References

Arredondo-Alonso et al., Microbial Genomics 2018;4 DOI 10.1099/mgen.0.000224


irycisBioinfo/PATO documentation built on Oct. 19, 2023, 3:07 p.m.