screening: Screening gene markers

View source: R/screening.R

screening    R Documentation

Screening gene markers

Description

This function annotates virulence factors, antibiotic resistance genes and/or biocide and metal resistance genes in the genomes (files). The annotation is performed with MMseqs2 (https://github.com/soedinglab/MMseqs2) for protein sequences or with Minimap2 (https://github.com/lh3/minimap2) for nucleotide sequences (wgs or nucl).

Usage

screening(
  data,
  type = "nucl",
  database = c("AbR", "VF_A", "VF_B"),
  query = "all",
  n_cores
)
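
As a minimal sketch of how the call above might be wrapped in a script, the helper below fixes a few argument values and forwards an existing object. The names run_screening, mm and cores are hypothetical, and the meaning given to n_cores (number of CPU cores) is an assumption, since this page does not describe that argument.

```r
# Hypothetical wrapper around the documented call; `mm` is assumed to be an
# mmseq object created earlier in the PATO workflow.
run_screening <- function(mm, cores = 2) {
  # PATO::screening is only looked up when the wrapper is called,
  # so defining the wrapper does not require PATO to be installed.
  PATO::screening(
    data = mm,
    type = "nucl",                  # default shown in Usage
    database = c("AbR", "bacmet"),  # any subset of the documented databases
    query = "all",
    n_cores = cores                 # assumed: number of CPU cores to use
  )
}
```

Calling run_screening(my_mmseq), where my_mmseq is a placeholder for an object you have already built, would then return the data.frame described under Value.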

Arguments

data

An mmseq object

type

Whether the data set is nucleotide or protein. The user must specify it.

database

A vector with the query databases:

  • AbR: Antibiotic resistance database (ResFinder)

  • VF_A: VFDB core dataset (genes associated with experimentally verified VFs only)

  • VF_B: VFDB full dataset (all genes related to known and predicted VFs)

  • bacmet: BacMet dataset (genes associated with biocide and metal resistance)

query

"all" or "accessory". Performs the annotation on the whole protein dataset or only on the accessory proteins.

Details

Databases

  • Virulence Factor DataBase (Set_A and Set_B) (http://www.mgc.ac.cn/cgi-bin/VFs/v5/main.cgi)

  • ResFinder (https://cge.cbs.dtu.dk/services/ResFinder/)

  • BacMet (http://bacmet.biomedicine.gu.se/)

The function can re-use the previous computational steps of mmseqs or create a new index database from the files. Re-using the previous steps shortens the computation time. This method uses the search algorithm of MMseqs2, so it only returns high-identity matches.

Value

A data.frame with the annotation information

  • Genome: Query genome

  • Protein: Query protein

  • target: Subject protein (AbR or VF)

  • pident: Percentage of identical matches

  • alnlen: Alignment length

  • mismatch: Number of mismatches

  • gapopen: Number of gap openings

  • qstart: Start of the alignment in the query

  • qend: End of the alignment in the query

  • tstart: Start of the alignment in the target

  • tend: End of the alignment in the target

  • evalue: E-value

  • bits: Bit score

  • DataBase: Database (AbR, VF_A, VF_B or bacmet)

  • Gene: Gene name

  • Description: Functional annotation (VF) or category (AbR)
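
The returned data.frame can be filtered and summarised with base R. The sketch below uses a mock table with a subset of the columns listed above; all values, the hits and strong_hits names, and the 95% identity threshold are invented for illustration and are not real screening output.

```r
# Mock result with a subset of the documented columns (invented values).
hits <- data.frame(
  Genome   = c("genome_1", "genome_1", "genome_2", "genome_2"),
  Protein  = c("prot_a", "prot_b", "prot_c", "prot_d"),
  pident   = c(99.5, 82.0, 97.1, 100.0),
  evalue   = c(1e-120, 1e-30, 1e-98, 1e-150),
  DataBase = c("AbR", "VF_A", "AbR", "VF_B"),
  Gene     = c("gene_w", "gene_x", "gene_y", "gene_z"),
  stringsAsFactors = FALSE
)

# Keep hits at or above an illustrative identity threshold.
min_identity <- 95
strong_hits <- hits[hits$pident >= min_identity, ]

# Count the retained hits per genome and database.
hit_counts <- table(strong_hits$Genome, strong_hits$DataBase)
print(hit_counts)
```

The same pattern applies to the other columns, for example filtering on evalue or grouping by Gene.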

Note

Keep in mind that the results from accessory are based on the annotation of the representative protein of each homologous cluster; a hit therefore does not mean that all the genomes carry the same allele of the gene.


irycisBioinfo/PATO documentation built on Oct. 19, 2023, 3:07 p.m.