Gene predictions using Prodigal

Description

Finds coding genes in a genome, using the Prodigal software, and outputs them as a FASTA file.

Usage

prodigalPredict(genome.file, prot.file, nuc.file = NULL, closed.ends = TRUE,
  motif.scan = FALSE)

Arguments

genome.file

Name of a FASTA formatted file with all the DNA sequences for a genome (chromosomes, plasmids, contigs etc.).

prot.file

Name of output file. Predicted protein sequences will be written to this file, in a FASTA format.

nuc.file

If specified, nucleotide version of each protein is written to this file (default NULL).

closed.ends

Logical, if TRUE genes are not allowed to run off edges (default TRUE).

motif.scan

Logical, if TRUE forces motif scan instead of Shine-Dalgarno trainer (default FALSE).

Details

This function sets up a call to the software Prodigal (Hyatt et al., 2010), which is designed to find coding genes in prokaryote genomes. It runs fast and has performed very well in comparisons among automated gene finders. The default options used here are believed to be the best choice for pan-genomic analyses.
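To make the idea of "setting up a call" concrete, here is a minimal sketch of how the arguments could be mapped to a Prodigal command line. The helper build_prodigal_cmd is hypothetical and is not the package's own code. The flags -i, -a, -d, -c and -n are standard Prodigal command-line options, but whether prodigalPredict uses exactly this set is an assumption.

```r
# Hypothetical helper (not part of micropan): assembles a Prodigal command
# line from the same arguments that prodigalPredict() takes.
build_prodigal_cmd <- function(genome.file, prot.file, nuc.file = NULL,
                               closed.ends = TRUE, motif.scan = FALSE) {
  args <- c("-i", shQuote(genome.file),        # input genome (FASTA)
            "-a", shQuote(prot.file))          # output: protein translations
  if (!is.null(nuc.file)) {
    args <- c(args, "-d", shQuote(nuc.file))   # output: nucleotide sequences
  }
  if (closed.ends) args <- c(args, "-c")       # genes may not run off edges
  if (motif.scan)  args <- c(args, "-n")       # force a full motif scan
  paste(c("prodigal", args), collapse = " ")
}

# File names below are placeholders
cmd <- build_prodigal_cmd("genome.fsa", "protein.faa")
print(cmd)
# system(cmd)  # would run Prodigal, provided it is installed
```

Building the command as a string first makes it easy to inspect before handing it to system().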

Value

The call to Prodigal produces a FASTA formatted file with predicted protein sequences, and if nuc.file is specified, a similar file with nucleotide sequences. See readFasta for how to read such files into R.
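As a quick sanity check on the output, the number of predicted proteins can be counted in base R without any extra packages. The sketch below uses a small made-up FASTA file in place of a real prot.file, and assumes every record has at least one sequence line. For real work, readFasta is the more convenient route.

```r
# Toy FASTA file standing in for a real prot.file (sequences are made up)
prot.file <- tempfile(fileext = ".faa")
writeLines(c(">seq1_1", "MKKLLA",
             ">seq1_2", "MSTNPKPQR", "KTKRNT"), prot.file)

lines <- readLines(prot.file)
is.header <- startsWith(lines, ">")

# Each header line starts one record, so counting headers counts proteins
n.proteins <- sum(is.header)

# Group the sequence lines by the record they belong to and join them
record.id <- cumsum(is.header)
sequences <- vapply(split(lines[!is.header], record.id[!is.header]),
                    paste, character(1), collapse = "")
names(sequences) <- sub("^>", "", lines[is.header])

print(n.proteins)
print(nchar(sequences))
```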

Note

The Prodigal software must be installed on the system for this function to work, i.e. the command system("prodigal") must be recognized as a valid command when you run it in the Console window. The executable must be named plain prodigal, without any version number in its name.
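One way to test this requirement before calling the function is to ask R whether the executable can be found on the search path. The helper name prodigal_available is hypothetical; Sys.which is base R.

```r
# Hypothetical helper: TRUE if an executable called `cmd` is found on the PATH
prodigal_available <- function(cmd = "prodigal") {
  path <- unname(Sys.which(cmd))   # "" when the command is not found
  !is.na(path) && nzchar(path)
}

if (!prodigal_available()) {
  message("Prodigal was not found on the PATH; install it or adjust the PATH.")
}
```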

Author(s)

Lars Snipen and Kristian Hovde Liland.

References

Hyatt, D., Chen, G., LoCascio, P.F., Land, M.L., Larimer, F.W., Hauser, L.J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 11:119.

See Also

entrezDownload.

Examples

## Not run: 
# Using a small genome file in this package
# We need to uncompress it first...
extdata.path <- file.path(path.package("micropan"),"extdata")
filenames <- "Mpneumoniae_309_genome.fsa"
pth <- lapply( file.path( extdata.path, paste( filenames, ".xz", sep="" ) ), xzuncompress )

# Calling Prodigal, and using a similar name (_genome replaced by _protein) in output
prodigalPredict( file.path(extdata.path,filenames), gsub("_genome","_protein",filenames) )

# ...and compressing the genome-file again...
pth <- lapply( file.path( extdata.path, filenames ), xzcompress )

## End(Not run)