The GmapGenome class represents a genome that has been indexed for use
with the GMAP suite of tools. It is typically used as a parameter to
the functions gsnap and bam_tally. This class
also provides the means to index new genomes, from either a FASTA file
or a BSgenome object. Genome indexes are typically stored in a
centralized directory on the file system and are identified by a
string key.
GmapGenome(genome, directory =
GmapGenomeDirectory(create = create), name = genomeName(genome),
create = FALSE, ...):
Creates a GmapGenome corresponding to the genome
argument, which may be either a string identifier of the genome
within directory, a FastaFile or
DNAStringSet of the genome sequence, or
a BSgenome object.
The genome index is stored in the location given by the directory
argument, which may be either a GmapGenomeDirectory object or a
string path.
The name argument is the actual key used for storing the
genome index within directory. If genome is a
string, it is taken as the key. If a FastaFile, it is the
basename of the file without the extension. If a BSgenome,
it is the providerVersion. Otherwise, the name must
be specified. If create is TRUE, the genome index is
created if one with that name does not already exist. This
only works if genome actually contains the genome sequence,
i.e., it is not a string key.
The first example below gives the typical and recommended usage when implementing a reproducible analysis.
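The default key for a FASTA file, as described above, can be illustrated with base R alone. This is only a sketch: default_key is a hypothetical helper, not a gmapR function, and the package's own rule may differ in detail (for example, for compressed files).

```r
# Hypothetical helper (not part of gmapR): derive a genome key from a FASTA
# path by taking the basename and dropping the final file extension.
default_key <- function(fasta_path) {
  tools::file_path_sans_ext(basename(fasta_path))
}

default_key("/data/genomes/hg19.p53.fasta")  # "hg19.p53"
```

If a different key is wanted, it can be passed explicitly through the name argument of the constructor.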
getSeq(x, which = seqinfo(x)): Extracts the genomic
sequence for each region in which (something coercible to
GRanges). The result is a character vector for now. This is
implemented in C and is very efficient. The default for which
will retrieve the entire genome.
as(object, "DNAStringSet"): Extracts the entire
sequence of the genome as a DNAStringSet. One consequence is
that exporting the genome becomes possible with rtracklayer:
export(object, "genome.fasta").
path(object): returns the path to the directory
containing the genome index files.
directory(x): returns the GmapGenomeDirectory
that is the parent of the directory containing the index files for
this genome.
genome(x): gets the name of this genome.
seqinfo(x): gets the Seqinfo
for this genome; only sequence names and lengths are available.
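The accessors, getSeq, and the coercion described above might be combined as in the following sketch. It assumes that gmapR and rtracklayer are installed and that an index can be built on the current machine. The temporary directory, the output file name, and the variable names are illustrative choices, not package defaults.

```r
# Sketch only: the locations below are illustrative, not gmapR defaults.
genome_dir <- file.path(tempdir(), "gmap_genomes")
dir.create(genome_dir, showWarnings = FALSE)
out_fasta <- file.path(tempdir(), "genome.fasta")

if (requireNamespace("gmapR", quietly = TRUE) &&
    requireNamespace("rtracklayer", quietly = TRUE)) {
  library(gmapR)

  # Build (or reuse) an index from the small FASTA file shipped with gmapR
  fa <- system.file("extdata/hg19.p53.fasta", package = "gmapR")
  gg <- GmapGenome(rtracklayer::FastaFile(fa), directory = genome_dir,
                   create = TRUE)

  # Accessors
  print(genome(gg))     # key identifying this genome
  print(path(gg))       # directory containing the index files
  print(directory(gg))  # parent GmapGenomeDirectory
  print(seqinfo(gg))    # sequence names and lengths only

  # Extraction and coercion
  seqs <- getSeq(gg)              # entire genome, as a character vector
  dna  <- as(gg, "DNAStringSet")  # entire genome, as a DNAStringSet
  rtracklayer::export(gg, out_fasta)
}
```

Passing a string path as directory keeps the index out of the default centralized location, which may be convenient when experimenting.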
Michael Lawrence
## Not run:
library(BSgenome.Dmelanogaster.UCSC.dm3)
flyGG <- GmapGenome(Dmelanogaster, create = TRUE)
## access system-wide genome using a key
flyGG <- GmapGenome(genome = "dm3")
which <- seqinfo(flyGG)["chr4"]
firstchr <- getSeq(flyGG, which)
genome(which) <- "hg19"
## should throw an error
try(getSeq(flyGG, which))
## create a GmapGenome from a FASTA file
fa <- system.file("extdata/hg19.p53.fasta", package="gmapR")
fastaFile <- rtracklayer::FastaFile(fa)
gmapGenome <- GmapGenome(fastaFile, create=TRUE)
## End(Not run)