run.prodigal: Finding coding genes

View source: R/extern.R

run.prodigal    R Documentation

Finding coding genes

Description

Finding coding genes in genomic DNA using the Prodigal software.

Usage

run.prodigal(
  genome_file = system.file("extdata/examples/2619619645/in", "2619619645.genes.fna",
    package = "microtrait", mustWork = TRUE),
  fa_file = gsub(".fna", ".prodigal.fa", genome_file),
  faa_file = gsub(".fna", ".prodigal.faa", genome_file),
  mode = "single",
  transtab = 11,
  maskN = FALSE,
  bypassSD = FALSE
)

Arguments

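genome_file

A FASTA file with the genome sequence(s).

fa_file

If provided, Prodigal writes the DNA sequences of all predicted genes to this FASTA file (text).

faa_file

If provided, Prodigal writes all predicted proteins to this FASTA file (text).

mode

Either "single" or "meta", see below.

transtab

The translation table (genetic code) to use; the default is 11, see below.

maskN

Turn on masking of N's (logical).

bypassSD

Bypass the Shine-Dalgarno filter (logical).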

Details

The external software Prodigal is used to scan through a prokaryotic genome to detect the protein coding genes. This free software can be installed from https://github.com/hyattpd/Prodigal.

In addition to the standard output from this function, FASTA files with protein and/or DNA sequences may be produced directly by providing filenames in faa_file and fa_file.
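As a small illustration of how the default output names under Usage are derived from the input name, the sketch below applies the same gsub() calls to a hypothetical path ("genomes/example.fna" is an assumption, not a file shipped with the package). The stricter variant at the end is only a suggestion and not the package default.

```r
# Hypothetical input path, used only for illustration.
genome_file <- "genomes/example.fna"

# Same pattern as the defaults shown under Usage.
fa_file  <- gsub(".fna", ".prodigal.fa",  genome_file)
faa_file <- gsub(".fna", ".prodigal.faa", genome_file)

print(fa_file)   # "genomes/example.prodigal.fa"
print(faa_file)  # "genomes/example.prodigal.faa"

# A stricter alternative (an assumption, not the package default):
# replace only a literal ".fna" at the very end of the name.
fa_file_strict <- sub("\\.fna$", ".prodigal.fa", genome_file)
```

Because "." is a wildcard in a regular expression, the default pattern also matches any character followed by "fna" elsewhere in the path, so passing explicit output names may be safer for unusual file names.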

The mode argument specifies whether the input data should be treated as a single genome ("single", the default) or as a metagenome ("meta").

The translation table is by default 11 (the bacterial, archaeal and plant plastid code), but table 4 should be used for Mycoplasma etc.

Setting maskN = TRUE prevents genes from containing runs of N. Setting bypassSD = TRUE turns off the search for a Shine-Dalgarno motif.
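The options described above might be combined as in the following sketch for a genome that uses translation table 4. The input path "my_genome.fna" and the output names are hypothetical, and the call is guarded so that nothing runs unless the microtrait package, the prodigal executable and the input file are all present. The function is looked up with getFromNamespace() because this page does not say whether run.prodigal is exported.

```r
# Hypothetical file names; replace with real paths.
genome_file <- "my_genome.fna"

# Argument names follow the Usage section of this page.
args <- list(
  genome_file = genome_file,
  fa_file     = "my_genome.prodigal.fa",
  faa_file    = "my_genome.prodigal.faa",
  mode        = "single",  # treat the input as one genome
  transtab    = 4,         # e.g. Mycoplasma
  maskN       = TRUE,      # do not allow runs of N inside genes
  bypassSD    = FALSE
)

# Only attempt the call when every prerequisite is available.
can_run <- requireNamespace("microtrait", quietly = TRUE) &&
  nzchar(Sys.which("prodigal")) &&
  file.exists(genome_file)

if (can_run) {
  run_prodigal <- utils::getFromNamespace("run.prodigal", "microtrait")
  result <- do.call(run_prodigal, args)
}
```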

Value

prodigal outputs

Note

The Prodigal software must be installed on the system for this function to work, i.e. the command ‘system("prodigal -h")’ must be recognized as a valid command if you run it in the Console window.
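One way to check this requirement from R before calling the function is sketched below, using the base function Sys.which(). It assumes the executable is named "prodigal" and is expected on the PATH; the variable names are illustrative.

```r
# Sys.which() returns the full path of an executable, or "" if it is not found.
prodigal_path <- unname(Sys.which("prodigal"))
has_prodigal  <- isTRUE(nzchar(prodigal_path))

if (has_prodigal) {
  message("Found prodigal at: ", prodigal_path)
} else {
  message("prodigal was not found on the PATH; install it before calling run.prodigal().")
}
```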


ukaraoz/microtrait documentation built on March 18, 2024, 5:47 p.m.