Construct a pangenome from fasta files

Description

This function constructs an initial pangenome object from a set of fasta files. The pangenome itself is not calculated here; the function only sets everything up before the lengthier pangenome calculation begins.

Usage

Arguments

paths

A character vector with the locations of the fasta files

translated

A boolean indicating if the fasta files contain amino acid sequences

geneLocation

A function, string or data.frame. If it is a data.frame, it should contain the columns 'contig', 'start', 'end' and 'strand', with one row for each gene. If it is a function, it should take the name (fasta description) of each gene and return a data.frame with the same columns. If it is a string, it should specify the format of the gene names. Currently only 'prodigal' is supported.

lowMem

Boolean. Should FindMyFriends avoid storing sequences in memory?

...

Additional defaults to set on the object
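
As an illustration of the function form of geneLocation, the following is a minimal sketch of a parser that turns gene names into the data.frame described above. The header format "contig|start|end|strand", the name parseLocation, and the encoding of strand as 1 / -1 are assumptions made for this example; they are not taken from FindMyFriends.

```r
# Hypothetical parser for gene names of the (assumed) form
# "<contig>|<start>|<end>|<strand>", e.g. "contig1|12|345|+".
parseLocation <- function(desc) {
  # Split each description on the literal '|' character
  fields <- strsplit(desc, "|", fixed = TRUE)
  # Fail early if any header does not have exactly four fields
  stopifnot(all(lengths(fields) == 4))
  # Stack the fields into a character matrix, one row per gene
  parts <- do.call(rbind, fields)
  data.frame(
    contig = parts[, 1],
    start  = as.integer(parts[, 2]),
    end    = as.integer(parts[, 3]),
    # Assumption: forward strand encoded as 1, reverse strand as -1
    strand = ifelse(parts[, 4] == "+", 1L, -1L),
    stringsAsFactors = FALSE
  )
}

loc <- parseLocation(c("contig1|12|345|+", "contig1|400|820|-"))
print(loc)

# With FindMyFriends installed, such a function could then be passed on:
# pangenome(genomeFiles, TRUE, geneLocation = parseLocation)
```

Whether the 1 / -1 strand encoding matches what the package expects should be checked against real data before relying on it.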

Value

A pgVirtual subclass object depending on geneLocation and lowMem.

geneLocation   lowMem   Resulting class
NULL           FALSE    pgFull
NULL           TRUE     pgLM
!NULL          FALSE    pgFullLoc
!NULL          TRUE     pgLMLoc
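
The table above can be restated as a small lookup, which may help when writing code that needs to know which class to expect. The helper expectedClass is hypothetical and not part of FindMyFriends; it only mirrors the table.

```r
# Hypothetical helper mirroring the table above (not part of FindMyFriends)
expectedClass <- function(geneLocation = NULL, lowMem = FALSE) {
  # lowMem selects between the in-memory and low-memory variants
  base <- if (lowMem) "pgLM" else "pgFull"
  # Any non-NULL geneLocation yields the corresponding 'Loc' subclass
  if (is.null(geneLocation)) base else paste0(base, "Loc")
}

print(expectedClass())
print(expectedClass(geneLocation = "prodigal", lowMem = TRUE))
```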

Examples

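# Unpack the example genomes shipped with the package into a temporary directory
location <- tempdir()
unzip(system.file('extdata', 'Mycoplasma.zip', package='FindMyFriends'),
      exdir=location)
# list.files() expects a regular expression, not a glob
genomeFiles <- list.files(location, full.names=TRUE, pattern='\\.fasta$')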

# Create pgFull
pangenome(genomeFiles, TRUE)

# Create pgFullLoc
pangenome(genomeFiles, TRUE, geneLocation='prodigal')

# Create pgLM
pangenome(genomeFiles, TRUE, lowMem=TRUE)

# Create pgLMLoc
pangenome(genomeFiles, TRUE, geneLocation='prodigal', lowMem=TRUE)
