The GmapGenome class represents a genome that has been indexed for use
with the GMAP suite of tools. It is typically used as a parameter to
bam_tally. This class
also provides the means to index new genomes, from either a FASTA file
or a BSgenome object. Genome indexes are typically stored in a
centralized directory on the file system and are identified by a
string key.
GmapGenome(genome, directory = GmapGenomeDirectory(create = create), name = genomeName(genome), create = FALSE, ...): Creates a GmapGenome corresponding to the genome argument, which may be either a string identifier of the genome within directory, a FastaFile or DNAStringSet of the genome sequence, or a BSgenome object.
The genome index is stored in the location given by the directory argument, which may be either a GmapGenomeDirectory object or a string path.
The name argument is the actual key used for storing the genome index within directory. If genome is a string, it is taken as the key. If a FastaFile, it is the basename of the file without the extension. If a BSgenome, it is the providerVersion. Otherwise, the name must be specified.
If create is TRUE, the genome index is created if one with that name does not already exist. This only works if genome actually contains the genome sequence, that is, if it is not just a string key.
The first example below gives the typical and recommended usage when implementing a reproducible analysis.
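The default key for a FASTA file ("the basename of the file without the extension") can be previewed with base R before building an index. The following is only a sketch: fasta_key is a hypothetical helper that is not part of gmapR, the path is made up, and the package's own genomeName() may treat unusual or compressed file names differently.

```r
# Hypothetical helper (not part of gmapR): mimic the documented default of
# "basename of the file without the extension" for a FASTA path.
fasta_key <- function(path) {
  tools::file_path_sans_ext(basename(path))
}

# Illustrative path only; the file does not need to exist for this check.
key <- fasta_key("/data/genomes/hg19.p53.fasta")
print(key)
```

If the derived key is not the one wanted, passing name explicitly to GmapGenome avoids depending on the default.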
Extracting Genomic Sequence
getSeq(x, which = seqinfo(x)): Extracts the genomic sequence for each region in which (something coercible to GRanges). The result is a character vector for now. This is implemented in C and is very efficient. The default for which will retrieve the entire genome.
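As a sketch of restricting getSeq to a sub-region rather than a whole chromosome, the helper below builds a GRanges and passes it as which. The name extract_window and its arguments are hypothetical; the code assumes gmapR is attached (so that getSeq dispatches on a GmapGenome) and that GenomicRanges and IRanges are installed.

```r
# Hypothetical helper: pull the first `width` bases of one sequence.
# Assumes gmapR is attached so that getSeq() has a GmapGenome method.
extract_window <- function(gg, seqname, width = 100L) {
  which <- GenomicRanges::GRanges(
    seqnames = seqname,
    ranges = IRanges::IRanges(start = 1L, width = width)
  )
  getSeq(gg, which)
}

# Usage, assuming a "dm3" index already exists in the default directory:
# library(gmapR)
# flyGG <- GmapGenome(genome = "dm3")
# extract_window(flyGG, "chr4", width = 50L)
```

Keeping the region construction in one place makes it easier to check that sequence names match those reported by seqinfo for the genome.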
as(object, "DNAStringSet"): Extracts the entire sequence of the genome as a DNAStringSet. One consequence is that the genome becomes usable with rtracklayer.
path(object): returns the path to the directory containing the genome index files.
directory(x): returns the GmapGenomeDirectory that is the parent of the directory containing the index files for this genome.
genome(x): gets the name of this genome.
seqinfo(x): gets the Seqinfo for this genome; only sequence names and lengths are available.
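The four accessors above can be gathered into a single summary, which may be handy when logging which index an analysis used. This is a minimal sketch: describe_genome is a hypothetical helper, and it assumes gmapR is attached so that path, directory, genome and seqinfo all have methods for GmapGenome.

```r
# Hypothetical helper: collect the documented accessors into one list.
# Assumes gmapR is attached; `gg` is expected to be a GmapGenome object.
describe_genome <- function(gg) {
  list(
    key = genome(gg),                 # name (key) of the genome
    index_path = path(gg),            # directory holding the index files
    parent_directory = directory(gg), # GmapGenomeDirectory containing it
    sequences = seqinfo(gg)           # sequence names and lengths only
  )
}

# Usage, assuming a "dm3" index already exists in the default directory:
# library(gmapR)
# str(describe_genome(GmapGenome(genome = "dm3")), max.level = 1)
```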
## Not run:
library(BSgenome.Dmelanogaster.UCSC.dm3)
flyGG <- GmapGenome(Dmelanogaster, create = TRUE)

## access system-wide genome using a key
flyGG <- GmapGenome(genome = "dm3")

which <- seqinfo(flyGG)["chr4"]
firstchr <- getSeq(flyGG, which)

genome(which) <- "hg19"
## should throw an error
try(getSeq(flyGG, which))

## create a GmapGenome from a FASTA file
fa <- system.file("extdata/hg19.p53.fasta", package="gmapR")
fastaFile <- rtracklayer::FastaFile(fa)
gmapGenome <- GmapGenome(fastaFile, create=TRUE)

## End(Not run)