A set of functions for making a BSgenome data package.
## Top-level BSgenomeForge function:
forgeBSgenomeDataPkg(x, seqs_srcdir=".", destdir=".", verbose=TRUE)
## Low-level BSgenomeForge functions:
forgeSeqlengthsRdsFile(seqnames, prefix="", suffix=".fa",
                       seqs_srcdir=".", seqs_destdir=".",
                       genome=NA_character_, verbose=TRUE)
forgeSeqlengthsRdaFile(seqnames, prefix="", suffix=".fa",
                       seqs_srcdir=".", seqs_destdir=".",
                       genome=NA_character_, verbose=TRUE)
forgeSeqFiles(provider, genome,
              seqnames, mseqnames=NULL,
              seqfile_name=NA, prefix="", suffix=".fa",
              seqs_srcdir=".", seqs_destdir=".",
              ondisk_seq_format=c("2bit", "rds", "rda", "fa.rz", "fa"),
              verbose=TRUE)
forgeMasksFiles(seqnames, nmask_per_seq,
                seqs_destdir=".",
                ondisk_seq_format=c("2bit", "rda", "fa.rz", "fa"),
                masks_srcdir=".", masks_destdir=".",
                AGAPSfiles_type="gap", AGAPSfiles_name=NA,
                AGAPSfiles_prefix="", AGAPSfiles_suffix="_gap.txt",
                RMfiles_name=NA, RMfiles_prefix="", RMfiles_suffix=".fa.out",
                TRFfiles_name=NA, TRFfiles_prefix="", TRFfiles_suffix=".bed",
                verbose=TRUE)
x |
A BSgenomeDataPkgSeed object or the name of a BSgenome data package seed file. See the BSgenomeForge vignette in this package for more information. |
seqs_srcdir, masks_srcdir |
Single strings indicating the path to the source directories i.e. to the directories containing the source data files. Only read access to these directories is needed. See the BSgenomeForge vignette in this package for more information. |
destdir |
A single string indicating the path to the directory where the source tree of the target package should be created. This directory must already exist. See the BSgenomeForge vignette in this package for more information. |
verbose |
|
provider |
The provider of the sequence data files e.g.
|
genome |
The name of the genome. Typically the name of an NCBI assembly (e.g.
|
seqnames, mseqnames |
A character vector containing the names of the single (for |
seqfile_name, prefix, suffix |
See the BSgenomeForge vignette in this package for more information,
in particular the description of the |
seqs_destdir, masks_destdir |
During the forging process the source data files are converted into
serialized Biostrings objects. Both |
ondisk_seq_format |
Specifies how the single sequences should be stored in the forged package.
Can be |
nmask_per_seq |
A single integer indicating the desired number of masks per sequence. See the BSgenomeForge vignette in this package for more information. |
AGAPSfiles_type, AGAPSfiles_name, AGAPSfiles_prefix, AGAPSfiles_suffix,
RMfiles_name, RMfiles_prefix, RMfiles_suffix,
TRFfiles_name, TRFfiles_prefix, TRFfiles_suffix |
These arguments are named after the corresponding fields of a BSgenome data package seed file. See the BSgenomeForge vignette in this package for more information. |
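The directory requirements described above (read access to the source directory, a destination directory that already exists) can be checked before forging. The following is a minimal sketch in base R, not part of BSgenome: preflight_forge() is a hypothetical helper, and it assumes that each source file is named <prefix><seqname><suffix>, which is the convention suggested by the prefix and suffix arguments.

```r
## Hypothetical helper (NOT part of BSgenome): pre-flight checks before forging.
preflight_forge <- function(seqnames, prefix="", suffix=".fa",
                            seqs_srcdir=".", destdir=".")
{
    ## ASSUMPTION: one source file per sequence, named <prefix><seqname><suffix>.
    src_files <- file.path(seqs_srcdir, paste0(prefix, seqnames, suffix))
    list(
        ## Only read access to the source directory is needed.
        srcdir_readable=dir.exists(seqs_srcdir) &&
                        file.access(seqs_srcdir, mode=4) == 0L,
        ## The destination directory must already exist.
        destdir_exists=dir.exists(destdir),
        ## Expected source files that could not be found.
        missing_files=src_files[!file.exists(src_files)]
    )
}

## Usage on a throwaway directory that contains only one of the two files:
src <- file.path(tempdir(), "preflight_src")
dir.create(src, showWarnings=FALSE)
invisible(file.create(file.path(src, "ce2chrX.fa.gz")))
res <- preflight_forge(c("chrX", "chrM"), prefix="ce2", suffix=".fa.gz",
                       seqs_srcdir=src, destdir=tempdir())
basename(res$missing_files)
```

An empty missing_files element together with two TRUE flags suggests that the inputs are in place; the helper does not validate the contents of the files.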
These functions are intended for Bioconductor users who want to make a new
BSgenome data package, not for regular users of these packages.
See the BSgenomeForge vignette in this package
(vignette("BSgenomeForge")) for extensive coverage
of this topic.
H. Pagès
seqs_srcdir <- system.file("extdata", package="BSgenome")
seqnames <- c("chrX", "chrM")
## Forge .2bit sequence files:
forgeSeqFiles("UCSC", "ce2",
              seqnames, prefix="ce2", suffix=".fa.gz",
              seqs_srcdir=seqs_srcdir,
              seqs_destdir=tempdir(), ondisk_seq_format="2bit")
## Forge .rds sequence files:
forgeSeqFiles("UCSC", "ce2",
              seqnames, prefix="ce2", suffix=".fa.gz",
              seqs_srcdir=seqs_srcdir,
              seqs_destdir=tempdir(), ondisk_seq_format="rds")
## Sanity checks:
library(BSgenome.Celegans.UCSC.ce2)
genome <- BSgenome.Celegans.UCSC.ce2
ce2_sequences <- import(file.path(tempdir(), "single_sequences.2bit"))
ce2_sequences0 <- DNAStringSet(list(chrX=genome$chrX, chrM=genome$chrM))
stopifnot(identical(names(ce2_sequences0), names(ce2_sequences)),
          all(ce2_sequences0 == ce2_sequences))
chrX <- readRDS(file.path(tempdir(), "chrX.rds"))
stopifnot(genome$chrX == chrX)
chrM <- readRDS(file.path(tempdir(), "chrM.rds"))
stopifnot(genome$chrM == chrM)
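Judging from the sanity checks above, the "2bit" format appears to produce a single single_sequences.2bit file, while the "rds" format produces one <seqname>.rds file per sequence. The sketch below encodes only those two observed cases; expected_forged_files() is a hypothetical helper that is not part of BSgenome, and the file names used by the other formats are deliberately left out because they are not shown here.

```r
## Hypothetical helper (NOT part of BSgenome): paths of the files that
## forgeSeqFiles() is expected to write, based only on the examples above.
expected_forged_files <- function(seqnames,
                                  ondisk_seq_format=c("2bit", "rds"),
                                  seqs_destdir=".")
{
    ondisk_seq_format <- match.arg(ondisk_seq_format)
    fnames <- switch(ondisk_seq_format,
        "2bit"="single_sequences.2bit",    # all single sequences in one file
        "rds"=paste0(seqnames, ".rds"))    # one file per sequence
    file.path(seqs_destdir, fnames)
}

## Check which of the expected .rds files are present in tempdir():
rds_paths <- expected_forged_files(c("chrX", "chrM"), "rds", tempdir())
file.exists(rds_paths)
```

Run after the forging calls above, file.exists() should return TRUE for each path if the assumed naming holds.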