annotate: Annotate Virulence Factors and Antibiotic Resistance Genes

View source: R/annotate.R

annotate R Documentation

Description

This function annotates virulence factors and antibiotic resistance genes in the genomes (files). The annotation is performed with the MMseqs2 software (https://github.com/soedinglab/MMseqs2) against the Virulence Factor Database (http://www.mgc.ac.cn/cgi-bin/VFs/v5/main.cgi) and ResFinder (https://cge.cbs.dtu.dk/services/ResFinder/). The function can either re-use the previous mmseqs computational steps or create a new index database from the files. Re-using the previous steps shortens the computation time. This method uses the search algorithm of MMseqs2, so it only returns high-identity matches.

Usage

annotate(
  data,
  type = "nucl",
  database = c("AbR", "VF_A", "VF_B"),
  query = "all"
)

Arguments

data

An mmseq object.

type

Specifies whether the data set is nucleotide or protein. The default is "nucl".

database

A character vector with the databases to query:

  • AbR: Antibiotic resistance database (ResFinder)

  • VF_A: VFDB core dataset (genes associated with experimentally verified VFs only)

  • VF_B: VFDB full dataset (all genes related to known and predicted VFs)

  • bacmet: BacMet dataset (genes associated with biocide and metal resistance)

query

"all" or "accessory". Performs the annotation on the whole protein dataset ("all") or only on the accessory genome ("accessory").
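
As an illustration of how the arguments above fit together, the following is a minimal sketch of a call that restricts the search to two databases and to the accessory genome. The object name `my_mmseq` is hypothetical and stands for an mmseq object created earlier in the session. The call is guarded so that the snippet does nothing when PATO or the object is unavailable.

```r
# Databases to query, chosen from the values listed under `database`.
selected_databases <- c("AbR", "VF_A")

hits <- NULL

# `my_mmseq` is a hypothetical mmseq object; it is not created here.
# The call is skipped when PATO is not installed or the object does not exist.
if (requireNamespace("PATO", quietly = TRUE) && exists("my_mmseq")) {
  hits <- PATO::annotate(
    data     = my_mmseq,
    type     = "nucl",
    database = selected_databases,
    query    = "accessory"
  )
  print(head(hits))
}
```

Writing the call as `PATO::annotate()` is a precaution: other packages, such as ggplot2, also export a function named `annotate`, and the explicit namespace avoids calling the wrong one.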

Value

A data.frame with the annotation information:

  • Genome: Query genome

  • Protein: Query protein

  • target: Subject protein (AbR or VF)

  • pident: Percentage of identical matches

  • alnlen: Alignment length

  • mismatch: Number of mismatches

  • gapopen: Number of gap openings

  • qstart: Start of the alignment in the query

  • qend: End of the alignment in the query

  • tstart: Start of the alignment in the target

  • tend: End of the alignment in the target

  • evalue: E-value

  • bits: Bit score

  • DataBase: Database (AbR, VF_A, VF_B or bacmet)

  • Gene: Gene name

  • Description: Functional annotation (VF) or category (AbR)
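
Because the result is a plain data.frame, it can be filtered and summarised with base R. The sketch below uses a toy data.frame that mimics a subset of the columns listed above. Every value in it is invented for illustration, and the identity and E-value cut-offs are arbitrary assumptions, not recommendations from the package.

```r
# Toy stand-in for the data.frame returned by annotate(). All values are
# invented for illustration; only some of the documented columns are included.
hits <- data.frame(
  Genome   = c("genome_1", "genome_1", "genome_2", "genome_2"),
  Protein  = c("prot_a", "prot_b", "prot_c", "prot_d"),
  pident   = c(99.1, 82.4, 100.0, 95.3),
  alnlen   = c(861, 540, 1203, 330),
  evalue   = c(1e-180, 1e-40, 0, 1e-75),
  DataBase = c("AbR", "VF_A", "AbR", "VF_B"),
  Gene     = c("gene_w", "gene_x", "gene_y", "gene_z"),
  stringsAsFactors = FALSE
)

# Hypothetical cut-offs; choose values appropriate to the analysis.
min_identity <- 90
max_evalue   <- 1e-50

# Keep only the hits that pass both thresholds.
strong_hits <- hits[hits$pident >= min_identity & hits$evalue <= max_evalue, ]

# Count the retained hits per genome and database.
hit_counts <- table(strong_hits$Genome, strong_hits$DataBase)
print(hit_counts)
```

The same pattern applies to a real result: replace the toy `hits` with the data.frame returned by `annotate()` and adjust the thresholds.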

Note

Keep in mind that the results for "accessory" are based on the annotation of the representative protein of each homologous cluster, and therefore do not imply that all the genomes carry the same allele of the gene.


irycisBioinfo/PATO documentation built on Oct. 19, 2023, 3:07 p.m.