| gdb.create | R Documentation |
Creates a new Genomic Database.
gdb.create(
groot = NULL,
fasta = NULL,
genes.file = NULL,
annots.file = NULL,
annots.names = NULL,
format = NULL,
verbose = FALSE
)
groot |
path to newly created database |
fasta |
an array of names or URLs of FASTA files. Can contain wildcards for multiple files |
genes.file |
name or URL of file that contains genes. If 'NULL' no genes are imported |
annots.file |
name or URL of file that contains annotations. If 'NULL' no annotations are imported |
annots.names |
annotation names |
format |
database format: "indexed" (default, single genome.seq + genome.idx)
or "per-chromosome" (separate .seq file per contig). If NULL, uses the value from
|
verbose |
if TRUE, prints verbose messages |
This function creates a new Genomic Database at the location specified by 'groot'. FASTA files are converted to 'Seq' format and an appropriate 'chrom_sizes.txt' file is generated (see "User Manual" for more details).
Two database formats are supported:
indexed: Single genome.seq + genome.idx (default). Recommended for genomes with many contigs. Provides better performance and scalability.
per-chromosome: Separate .seq file per contig.
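As a rough illustration of the two formats, the sketch below builds one database per format from two toy FASTA files and lists what ends up on disk. The file names, sequences and temporary paths are made up for the example. It assumes gdb.create is provided by the misha package and skips the database step when that package is not installed; the exact files produced may differ between versions.

```r
# Toy input: one FASTA file per contig (names and sequences are hypothetical).
fasta_dir <- tempfile("toy_fasta_")
dir.create(fasta_dir)
fasta_files <- file.path(fasta_dir, c("chr1.fa", "chr2.fa"))
writeLines(c(">chr1", "ACGTACGTACGTACGTACGT"), fasta_files[1])
writeLines(c(">chr2", "TTGGCCAATTGGCCAATTGG"), fasta_files[2])

# Database roots that do not exist yet.
indexed_root <- tempfile("toy_indexed_")
per_chrom_root <- tempfile("toy_per_chrom_")

# Assumption: gdb.create comes from the misha package; skip if it is unavailable.
if (requireNamespace("misha", quietly = TRUE)) {
    library(misha)
    gdb.create(groot = indexed_root, fasta = fasta_files, format = "indexed")
    gdb.create(groot = per_chrom_root, fasta = fasta_files, format = "per-chromosome")

    # Compare the files each format left under its database root.
    print(list.files(indexed_root, recursive = TRUE))
    print(list.files(per_chrom_root, recursive = TRUE))
}
```

Using fresh tempfile() paths for 'groot' avoids pointing gdb.create at a directory that already exists.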
If 'genes.file' is not 'NULL', four sets of intervals are created in the
database: tss, exons, utr3 and utr5. See
gintervals.import_genes for more details about importing gene
intervals.
'fasta', 'genes.file' and 'annots.file' can each be either a file path or a URL of the form 'ftp://[address]/[file]'. 'fasta' can also contain wildcards to indicate multiple files. The files that these arguments point to can be zipped or unzipped.
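The two ways of naming input files can be sketched in base R without creating a database. The snippet below writes a few hypothetical gzipped FASTA files, previews which of them a wildcard pattern matches, and assembles URLs in the 'ftp://[address]/[file]' form. Sys.glob is used only as a local preview; that gdb.create expands wildcards in the same way is an assumption. The directory and the "chrY" entry are illustrative.

```r
# Hypothetical local layout: a few gzipped FASTA files in a temporary directory.
fasta_dir <- tempfile("fasta_demo_")
dir.create(fasta_dir)
for (chrom in c("chr1", "chr2", "chrX")) {
    con <- gzfile(file.path(fasta_dir, paste0(chrom, ".fa.gz")), "w")
    writeLines(c(paste0(">", chrom), "ACGTACGTAC"), con)
    close(con)
}

# A single wildcard pattern standing in for several files.
fasta_pattern <- file.path(fasta_dir, "chr*.fa.gz")

# Preview the matches with base R globbing (gdb.create may expand differently).
matched <- basename(Sys.glob(fasta_pattern))
print(matched)

# URLs in the 'ftp://[address]/[file]' form, one per chromosome.
ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10"
fasta_urls <- paste(ftp, "chromosomes", paste0("chr", c("X", "Y"), ".fa.gz"), sep = "/")
print(fasta_urls)
```

Either 'fasta_pattern' or 'fasta_urls' could then be passed as the 'fasta' argument.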
See the 'Genomes' vignette for details on how to create a database from common genome sources.
None.
gdb.init, gdb.reload,
gintervals.import_genes
# ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10"
# mm10_dir <- file.path(tempdir(), "mm10")
# # only a single chromosome is loaded in this example
# # see the "Genomes" vignette for how to download all of them
# # and how to download other genomes
# gdb.create(
# mm10_dir,
# paste(ftp, "chromosomes", paste0(
# "chr", c("X"),
# ".fa.gz"
# ), sep = "/"),
# paste(ftp, "database/knownGene.txt.gz", sep = "/"),
# paste(ftp, "database/kgXref.txt.gz", sep = "/"),
# c(
# "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol",
# "refseq", "protAcc", "description", "rfamAcc",
# "tRnaName"
# )
# )
# gdb.init(mm10_dir)
# gintervals.ls()
# gintervals.all()