available_genomes: Available Genomes

available_genomes    R Documentation

Available Genomes

Description

Metadata about all the genomes available in UCSC, along with derived metadata such as effective genome sizes. See the data-raw/available_genomes.R script for the processing steps.

Usage

available_genomes

Format

An object of class data.frame with 199 rows and 27 columns.

Details

Structure

available_genomes is a data.frame with the following columns:

  • UCSC_orgID

    • Official UCSC ID of the genome

  • description

    • Verbose description of the assembly, source, and year/month of entry.

  • nibPath

    • Endpoint of the genome in UCSC gbdb.

  • organism

    • Name of the organism.

  • defaultPos

    • Default location of genome browser view for this genome.

  • active

    • Description not available.

  • orderKey

    • Description not available.

  • genome

    • The name of the genome.

  • scientificName

    • The scientific name of the organism.

  • htmlPath

    • Path in UCSC gbdb to the description.html file for the genome.

  • hgNearOk

    • Description not available.

  • hgPbOk

    • Description not available.

  • sourceName

    • Name of organization providing the genome.

  • taxId

    • The taxonomy ID of the organism.

  • genes_available

    • If TRUE, gene annotations are available in GTF format.

  • year

    • The year the genome assembly was added.

  • eff_genome_size_XXbp

    • The effective genome size of this genome, where XX is the read length in bp. Calculated with khmer at several read lengths and used to improve the accuracy of analysis. See the data-raw/available_genomes.R script for how the calculation was performed.

  • genome_length

    • The total length of the genome.

  • rlfs_available

    • If TRUE, R-loop forming sequence (RLFS) annotations are available in the RLBase AWS S3 repository.
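
Because eff_genome_size_XXbp stands for a family of columns, it can help to look them up by name pattern rather than hard-coding read lengths. The following is a minimal sketch, not part of RLSeq: the helper names egs_columns() and egs_read_lengths() are hypothetical, the regular expression assumes that XX is always a whole number of base pairs, and "hg38" is only an assumed example of a UCSC_orgID value.

```r
# Hypothetical helper: names of all effective-genome-size columns in `df`,
# assuming they follow the pattern eff_genome_size_<integer>bp.
egs_columns <- function(df) {
  grep("^eff_genome_size_[0-9]+bp$", colnames(df), value = TRUE)
}

# Hypothetical helper: read lengths (in bp) encoded in those column names.
egs_read_lengths <- function(df) {
  as.integer(sub("^eff_genome_size_([0-9]+)bp$", "\\1", egs_columns(df)))
}

# Usage with the real dataset, only if RLSeq is installed.
if (requireNamespace("RLSeq", quietly = TRUE)) {
  ag <- RLSeq::available_genomes
  print(egs_read_lengths(ag))
  # Effective genome sizes for one assembly ("hg38" is an assumed ID).
  print(ag[ag$UCSC_orgID == "hg38", egs_columns(ag)])
}
```

Matching on the column name keeps the lookup valid if read lengths are added or removed in a later build of the dataset.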

Examples

available_genomes
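
A common use of this table is to check which assemblies have both gene annotations and RLFS annotations available. The sketch below illustrates one way to do that with the genes_available and rlfs_available columns described above. The function genomes_with_annotations() is a hypothetical helper, not an RLSeq function, and treating missing flags as "not available" is an assumption.

```r
# Hypothetical helper: keep rows where both availability flags are TRUE.
# `%in% TRUE` treats NA as FALSE, so genomes with missing flags are dropped.
genomes_with_annotations <- function(df) {
  keep <- (df$genes_available %in% TRUE) & (df$rlfs_available %in% TRUE)
  df[keep, c("UCSC_orgID", "organism"), drop = FALSE]
}

# Usage with the real dataset, only if RLSeq is installed.
if (requireNamespace("RLSeq", quietly = TRUE)) {
  print(head(genomes_with_annotations(RLSeq::available_genomes)))
}
```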


Bishop-Laboratory/RLSeq documentation built on Jan. 28, 2023, 11:38 p.m.