The GmapGenome class represents a genome that has been indexed for use with the GMAP suite of tools. It is typically used as a parameter to the functions gsnap and bam_tally. This class also provides the means to index new genomes, from either a FASTA file or a BSgenome object. Genome indexes are typically stored in a centralized directory on the file system and are identified by a string key.
GmapGenome(genome, directory = GmapGenomeDirectory(create = create), name = genomeName(genome), create = FALSE, ...):
Creates a GmapGenome corresponding to the genome argument, which may be either a string identifier of the genome within directory, a FastaFile or DNAStringSet of the genome sequence, or a BSgenome object.
The genome index is stored in the location given by the directory argument, which may be either a GmapGenomeDirectory object or a string path.
The name argument is the actual key used for storing the genome index within directory. Its default depends on genome: if genome is a string, it is taken as the key; if a FastaFile, the key is the basename of the file without the extension; if a BSgenome, the key is its providerVersion. Otherwise, name must be specified. If create is TRUE, the genome index is created if one with that name does not already exist. This only works if genome actually contains the genome sequence, that is, if it is not just a string key.
The first example below gives the typical and recommended usage when implementing a reproducible analysis.
getSeq(x, which = seqinfo(x)): Extracts the genomic sequence for each region in which (something coercible to GRanges). The result is a character vector for now. This is implemented in C and is very efficient. The default for which will retrieve the entire genome.
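As a rough sketch of region-based extraction, the snippet below passes an explicit GRanges to getSeq. It assumes that an index keyed "dm3" already exists in the default genome directory (as in the Examples); the region coordinates are hypothetical and chosen only for illustration.

```r
library(gmapR)
library(GenomicRanges)

## Assumption: a genome index with key "dm3" has already been created
flyGG <- GmapGenome(genome = "dm3")

## Hypothetical region: the first 100 bases of chr4
region <- GRanges("chr4", IRanges(start = 1, end = 100))

## One character string is returned per region in 'region'
seqs <- getSeq(flyGG, region)
nchar(seqs)
```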
as(object, "DNAStringSet"): Extracts the entire sequence of the genome as a DNAStringSet. One consequence is that exporting with rtracklayer becomes possible: export(object, "genome.fasta").
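A minimal sketch of the coercion and export described above, again assuming an existing index keyed "dm3". The output file name is arbitrary, and coercing a whole genome loads its full sequence into memory.

```r
library(gmapR)
library(rtracklayer)

## Assumption: a genome index with key "dm3" has already been created
flyGG <- GmapGenome(genome = "dm3")

## Coerce the indexed genome to a DNAStringSet (one element per sequence)
dna <- as(flyGG, "DNAStringSet")

## Write the genome to a FASTA file, as described in the text above
export(flyGG, "genome.fasta")
```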
path(object): returns the path to the directory containing the genome index files.
directory(x): returns the GmapGenomeDirectory that is the parent of the directory containing the index files for this genome.
genome(x): gets the name of this genome.
seqinfo(x): gets the Seqinfo for this genome; only sequence names and lengths are available.
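The accessors can be tried together as in the sketch below, which assumes an existing index keyed "dm3". The seqlengths call comes from GenomeInfoDb rather than from this help page and is included only to show how the Seqinfo might be inspected.

```r
library(gmapR)

## Assumption: a genome index with key "dm3" has already been created
flyGG <- GmapGenome(genome = "dm3")

path(flyGG)       # directory holding the index files for this genome
directory(flyGG)  # parent GmapGenomeDirectory
genome(flyGG)     # key (name) of the genome
si <- seqinfo(flyGG)

## Sequence names and lengths are the only populated Seqinfo fields
seqlengths(si)
```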
Author(s): Michael Lawrence
## Not run:
library(BSgenome.Dmelanogaster.UCSC.dm3)
flyGG <- GmapGenome(Dmelanogaster, create = TRUE)
## access system-wide genome using a key
flyGG <- GmapGenome(genome = "dm3")
which <- seqinfo(flyGG)["chr4"]
firstchr <- getSeq(flyGG, which)
genome(which) <- "hg19"
## should throw an error
try(getSeq(flyGG, which))
## create a GmapGenome from a FASTA file
fa <- system.file("extdata/hg19.p53.fasta", package="gmapR")
fastaFile <- rtracklayer::FastaFile(fa)
gmapGenome <- GmapGenome(fastaFile, create=TRUE)
## End(Not run)