knitr::opts_chunk$set(echo = TRUE)

Cheat Sheet for adding Resources to the Hub

This doc describes how to add resources to AnnotationHub or ExperimentHub. In general, these instructions pertain to core team members only.


AnnotationHub


Run Periodically - whenever updated

ensembl release

This requires generating two types of resources

  1. GTF -> GRanges on the fly conversion
  2. 2Bit

GTF

On Local Machine:

  1. Navigate to the AnnotationHub_docker directory or create such a directory by following the instructions here.

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
docker-compose up
  3. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
url <- getOption("AH_SERVER_POST_URL")
library(AnnotationHubData)
# Since this grabs the file and converts on the fly
# There is actually no need to run with metadataOnly=FALSE
#
# This periodically fails. We realized that after a certain number
# of subsequent hits to the ensembl ftp site, the site starts asking
# for a username:password or simply blocks entirely.
# We have tried increasing the sleep time in between failed attempts
# with a max retry of 3.  Normally this function will run completely
# after 1-3 attempts. An alternative is to temporarily increase the wait time
# in the ftpFileInfo function in webAccessFunctions.R
#  In the future maybe consider adding a conditional userpwd argument to getURL
#
#  FIX ME: 
# More troubling in recent builds is that there is not a consistent mapping from
#  the ensembl repository (species) name to the species in the species file
#  https://ftp.ensembl.org/pub/release-108/species_EnsemblVertebrates.txt
#  This results in an error.
#  Hot fix:  for now a browser() call can be added in makeEnsemblGtfToGranges.R
#    in makeEnsemblGtfToAHM before running meta <- .ensemblMetadataFromUrl(sourceUrls);
#    when you hit this browser, manually subset the values to remove the bad entry.
#    Ideally we want to come up with a better mapping strategy so these could be included
#   

meta <- updateResources("EnsemblGTFtoGranges",
              BiocVersion = "3.6",
              preparerClasses = "EnsemblGtfImportPreparer",
              metadataOnly = TRUE, insert = FALSE,
              justRunUnitTest = FALSE, release = "89")
# test/check meta
pushMetadata(meta, url)

# you could rerun updateResources with insert=TRUE to do the push but I like to check resource data
  4. exit R

  5. Convert db to sqlite (puts the file in the data/ directory). Which command to run will depend on the name of the process, but it should be one of the following:

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  6. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

2Bit

This takes several hours to run. If it stops before finishing, see the cross-checking step below for picking up the uploads in the middle.

  1. Generate or have someone generate a SAS token/URL on Microsoft Azure data lakes for the staging bucket. The general recommendation is to expire the SAS after two weeks. The scripts potentially need both the token and the URL.

  2. Export the SAS URL as an environment variable AZURE_SAS_URL with something like the following.

export AZURE_SAS_URL='https://bioconductorhubs.blob.core.windows.net/staginghub?sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D'
  3. In R:
library(AnnotationHubData)
# to populate bucket need metadataOnly = FALSE
# metadataOnly will control upload
# insert controls inserting into database
meta <- updateResources("TwoBit",
            BiocVersion = "3.12",
            preparerClasses = "EnsemblTwoBitPreparer",
            metadataOnly = FALSE, insert = FALSE,
            justRunUnitTest = FALSE, release = "101")

# Because of the upload, the meta will not be an AHM object; run again
# with metadataOnly=TRUE to save an object that can be copied to your local machine
meta <- updateResources("TwoBit", BiocVersion = "3.12",
            preparerClasses = "EnsemblTwoBitPreparer",
            metadataOnly = TRUE, justRunUnitTest = FALSE,
            insert = FALSE, release = "101")

# a suggested step is to save the meta object, especially if running as a
# script or non-interactively, so it can be reloaded on the local machine
save(meta, file="metadataForTwoBit")
  4. Check that the Azure bucket is being populated.

  5. Once finished, cross-check that everything populated correctly. You can check which resources, if any, failed and need to be rerun with the following.

meta <- updateResources("TwoBit",
              BiocVersion = "3.12",
              preparerClasses = "EnsemblTwoBitPreparer",
              metadataOnly = TRUE, insert = FALSE,
              justRunUnitTest = FALSE,
              release = "101")
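## collect the RDataPath of each metadata record so it can be compared
## against what was actually uploaded to the bucket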
temp = rep(NA, length(meta))
for(i in 1:length(meta)){
       tit = metadata(meta[[i]])$RDataPath
       temp[i] = tit
}


## when loading AzureStor will ask about saving credentials; recommend `no`
library("AzureStor")

## You will need the SAS token generated for access
sas <- "sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D"

## check what files were uploaded
url <- "https://bioconductorhubs.blob.core.windows.net"
ep <- storage_endpoint(url, sas = sas)
container <- storage_container(ep, "staginghub")

## if you changed the first argument of updateResources to use a different
## directory, replace the TwoBit with that
filesOn = gsub(pattern="TwoBit/", "", list_storage_files(container, "TwoBit")[,"name"])


## difference of metadata and what has been uploaded
paths = setdiff(temp, filesOn)

## if there were any that were not uploaded subset and try again
newMeta = meta[match(paths,temp)]
pushResources(newMeta, uploadToRemote = TRUE, download = TRUE)

## in good practice it is also recommended to check the setdiff in the opposite
##   direction as well
  6. Navigate to the AnnotationHub_docker directory

  7. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
docker-compose up
  8. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
url <- getOption("AH_SERVER_POST_URL")
library(AnnotationHubData)

# option 1:
meta <- updateResources("TwoBit",
            BiocVersion = "3.12",
            preparerClasses = "EnsemblTwoBitPreparer",
            metadataOnly = TRUE, insert = FALSE,
            justRunUnitTest = FALSE, release = "101")


# option 2:
# if you saved the meta
load("metadataForTwoBit")

pushMetadata(meta, url)

# you could rerun updateResources with insert=TRUE to do the push but I like to check resource data
  9. exit R

  10. Convert db to sqlite (puts the file in the data/ directory)

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  11. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

After the GTF or 2bit resources are added to the production database, you should be able to get the resources and see the updated timestamp of the database with something like the following:

library(AnnotationHub)
hub = AnnotationHub()
length(query(hub, c("ensembl", "gtf", "release-101")))
length(query(hub, c("fasta", "twobit", "release-101")))
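
To also confirm the client picked up the refreshed database rather than a cached copy, checking the snapshot date is a quick sanity check (a minimal sketch using the hub object created above):

snapshotDate(hub)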

User Contributed Resources

Contributors will generally reach out when they want to update or add annotations to AnnotationHub. In the past they provided the annotations through an application like Dropbox; we have since updated the process and contributors now upload files directly to S3 in a temporary location. Send them the instructions found here

A key will have to be generated for them to access and use the AnnotationContributor account. Go to here; under the 'Security credentials' tab click 'Create access key'. Send the Access key ID to the contributor; the Secret access key is stored in AWS. When the contributor is done you can delete the key (small 'x' at the right of the key row).
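
If you prefer the command line to the AWS console, the key can also be managed with the AWS CLI (a sketch; it assumes your own credentials have IAM permission on the account):

# create a key for the AnnotationContributor account; the response contains
# the AccessKeyId and the SecretAccessKey
aws iam create-access-key --user-name AnnotationContributor

# when the contributor is done, remove the key again
aws iam delete-access-key --user-name AnnotationContributor --access-key-id <AccessKeyId>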

Advise that their data should be in a directory with the same name as the software package that will access the annotations; using subdirectories to keep track of versions is strongly encouraged.

Once the data is uploaded to S3 move the data to the proper location.
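
The move itself can be done with the AWS CLI along these lines; the bucket names and package directory below are placeholders for the temporary contributor location and the permanent hub location:

# list what the contributor uploaded
aws s3 ls s3://<contributor-bucket>/<PackageName>/ --recursive

# copy it to the permanent location, preserving the directory layout
aws s3 cp s3://<contributor-bucket>/<PackageName>/ s3://<hub-bucket>/<PackageName>/ --recursive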

We will need a copy of the package to generate and test the annotations. Request a link to the package from the user.

Follow instructions here

In general, generate the list of AnnotationHubMetadata objects with makeAnnotationHubMetadata() or updateResources. To test that the metadata.csv is properly formatted, run makeAnnotationHubMetadata.
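
For example, something along these lines, where the package path is hypothetical; makeAnnotationHubMetadata() reads the package's inst/extdata/metadata.csv and errors if the fields are malformed:

library(AnnotationHubData)
meta <- makeAnnotationHubMetadata("/path/to/MyAnnotationPackage")
meta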

Some suggested testing procedures can be found here

When satisfied, start the AnnotationHub docker and add the resource to the docker.

  1. Navigate to the AnnotationHub_docker directory

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
docker-compose up
  3. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
library(AnnotationHubData)
url <- getOption("AH_SERVER_POST_URL")

# run the appropriate makeAnnotationHubMetadata() call and then
# pushMetadata(meta[[1]], url)
  4. exit R

  5. Test: from the list of running dockers (sudo docker ps), find the process with db in the name, e.g. test_db1. Connect to the container with sudo docker exec -ti test_db bash. Log into mysql with mysql -p -u ahuser (the password will be the same as the exported MYSQL_REMOTE_PASSWORD). Explore with mysql commands like select * from resources order by id desc limit 5; (see the sketch below).
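
Roughly, that sequence of checks looks like the following sketch (the container name and ids will differ on your machine):

sudo docker ps                       # find the container with "db" in the name
sudo docker exec -ti test_db bash    # connect to that container
mysql -p -u ahuser                   # password is the exported MYSQL_REMOTE_PASSWORD
mysql> select * from resources order by id desc limit 5;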

  6. Convert db to sqlite (puts the file in the data/ directory). Which command to run will depend on the name of the process, but it should be one of the following:

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  7. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

Run at release time after new builds of annotations

makeStandardOrgDbs

This recipe should be run after the new OrgDb packages have been built for the release and are available in the devel repo. The code essentially loads the current packages, extracts the sqlite file from each and creates some basic metadata.

The BiocVersion should be whatever the next release version will be, i.e. the current devel that is about to become the release. The OrgDb resources get the same name when they are regenerated - they aren't tied to a genome build, so that's not a distinguishing feature in the title. We only want one OrgDb for each species available in a release, and the BiocVersion is what we use to filter the records that are exposed.

New per Nov 2021: we want the devel OrgDbs to be exposed as soon as they are generated close to release time, so users can update accordingly. Before running and adding new resources, manually remove the references in the biocversions table for the current release OrgDbs associated with the current devel version. The quickest way to get the references is to perform the following query in R:

library(AnnotationHub)
ah = AnnotationHub()
table(mcols(query(ah, "OrgDb"))$rdatadateadded)
## use the dates in this query to filter in sqlite3 to get ids

In sqlite3:

select id from resources where preparerclass="OrgDbFromPkgsImportPreparer" and rdatadateadded="2021-10-08" limit 1;
select id from resources where preparerclass="OrgDbFromPkgsImportPreparer" and rdatadateadded="2021-10-08" order by id desc limit 1;

## use resource ids to get ids from biocversions table

# check first and get ids of only devel version (soon to be release)
# in this scenario 3.13 is the current release, prepping for 3.14 (the current devel) to be released
select * from biocversions where biocversion="3.14" and resource_id between 95950 and 95968;

# use highest and lowest id to make delete call
delete from biocversions where id between 298567 and 298585;

Generate or have someone generate a SAS token/URL on Microsoft Azure data lakes for the staging bucket. The general recommendation is to expire the SAS after two weeks. The scripts potentially need both the token and the URL. Export the SAS URL as an environment variable AZURE_SAS_URL with something like the following.

export AZURE_SAS_URL='https://bioconductorhubs.blob.core.windows.net/staginghub?sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D'

On Local Machine:

  1. Navigate to the AnnotationHub_docker directory

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
sudo docker-compose up
  3. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
url <- getOption("AH_SERVER_POST_URL")
library(AnnotationHubData)

# see the man page for clarification on testing and actively pushing
# ?makeStandardOrgDbsToAHM
# to populate bucket need metadataOnly = FALSE
# metadataOnly will control upload
# insert controls inserting into database
meta <- updateResources("OrgDbs",
            BiocVersion = "3.5",
            preparerClasses = "OrgDbFromPkgsImportPreparer",
            metadataOnly = TRUE, insert = FALSE,
            justRunUnitTest = FALSE,
            downloadOrgDbs=TRUE)
# downloadOrgDbs can be FALSE for subsequent runs

pushMetadata(meta, url)
  4. exit R

  5. Convert db to sqlite (puts the file in the data/ directory)

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  6. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

  7. New as of Nov 2021: add the new OrgDbs manually so they are visible in the next devel. If the current release is 3.13 and devel is 3.14, and this is in preparation for the 3.14 release, this means adding a biocversion reference for 3.15.

In sqlite3 database:

## use whatever date when added
select id from resources where preparerclass="OrgDbFromPkgsImportPreparer" and rdatadateadded="2021-10-08" limit 1;
select id from resources where preparerclass="OrgDbFromPkgsImportPreparer" and rdatadateadded="2021-10-08" order by id desc limit 1;

## test -- should show current devel -- use the two values above for between
select * from biocversions where resource_id between 95950 and 95968;

In a separate R terminal you can quickly concatenate the values needed to add all the versions at once, replacing 3.15 with whatever the next devel is and using the low/high of the resource ids:

paste0("(3.15,",95950:95968, ")", collapse=",")

Paste that into the sqlite call:

insert into biocversions (biocversion, resource_id) Values [R output] ;
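
For example, with the resource ids above, the pasted statement ends up looking like this (middle values elided):

insert into biocversions (biocversion, resource_id) values (3.15,95950),(3.15,95951), ... ,(3.15,95968);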

## now check again that both current devel and future devel show
select * from biocversions where resource_id between 95950 and 95968;
  8. After uploading to production you can test that they are available in release and devel:
library(AnnotationHub)
hub = AnnotationHub()
query(hub, "OrgDb")
table(mcols(query(hub, "OrgDb"))$rdatadateadded)

makeStandardTxDbs

This recipe should be run after the new TxDbs have been built and are in the devel repo. The code loads the packages, extracts the sqlite file and creates metadata.

The BiocVersion should be whatever the next release version will be, i.e. the current devel that is about to become the release. Like the OrgDbs, the TxDb resources get the same titles when they are regenerated, and we only want one copy of each available in a release; the BiocVersion is what we use to filter the records that are exposed.

Generate or have someone generate a SAS token/URL on Microsoft Azure data lakes for the staging bucket. The general recommendation is to expire the SAS after two weeks. The scripts potentially need both the token and the URL. Export the SAS URL as an environment variable AZURE_SAS_URL with something like the following.

export AZURE_SAS_URL='https://bioconductorhubs.blob.core.windows.net/staginghub?sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D'

On Local Machine:

  1. Navigate to the AnnotationHub_docker directory

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
sudo docker-compose up
  3. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
url <- getOption("AH_SERVER_POST_URL")
library(AnnotationHubData)

# see the man page for clarification on testing and actively pushing
# follow example in man page
# will need the list of updated or added TxDbs to make a character list of files

?makeStandardTxDbsToAHM
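
## Rough sketch only: the recipe name ("TxDbs") and the preparerClasses value
## below are assumptions -- confirm the exact call against the example in
## ?makeStandardTxDbsToAHM before running.
meta <- updateResources("TxDbs",
            BiocVersion = "3.14",
            preparerClasses = "TxDbFromPkgsImportPreparer",
            metadataOnly = TRUE, insert = FALSE,
            justRunUnitTest = FALSE)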

pushMetadata(meta, url)
  4. exit R

  5. Convert db to sqlite (puts the file in the data/ directory)

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  6. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

  7. After uploading to production you can test that they are available in release and devel:

library(AnnotationHub)
hub = AnnotationHub()
query(hub, "TxDb")
table(mcols(query(hub, "TxDb"))$rdatadateadded)

makeNCBIToOrgDbs (Non Standard Orgs)

This code generates ~1700 non-standard OrgDb sqlite files from NCBI. These are less comprehensive than the standard OrgDb packages. It's best to run this on an EC2 instance. You can run it locally if your machine has enough space to download the files from NCBI, but keep in mind this code takes several hours to run.

The BiocVersion should be whatever the next release version will be, i.e. the current devel that is about to become the release. The OrgDb resources get the same name when they are regenerated - they aren't tied to a genome build, so that's not a distinguishing feature in the title. We only want one OrgDb for each species available in a release, and the BiocVersion is what we use to filter the records that are exposed.

Before running, make sure the AnnotationForge/inst/extdata/viableIDs.rda and GenomeInfoDbData/data/specData.rda files have been updated and pushed in their respective packages. The scripts to generate these data files are in the packages inst/scripts directories as viableIDs.R and updateGenomeInfoDbData.R respectively.
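
A quick sanity check before kicking off the run is to confirm that the installed AnnotationForge actually ships the regenerated file (a minimal sketch):

f <- system.file("extdata", "viableIDs.rda", package = "AnnotationForge")
file.mtime(f)   # should be a recent date if the updated package is installed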

New per Nov 2021: we want the devel OrgDbs to be exposed as soon as they are generated close to release time, so users can update accordingly. Before running and adding new resources, manually remove the references in the biocversions table for the current release OrgDbs associated with the current devel version. The quickest way to get the references is to perform the following query in R:

library(AnnotationHub)
ah = AnnotationHub()
table(mcols(query(ah, "OrgDb"))$rdatadateadded)
## use the dates in this query to filter in sqlite3 to get ids

In sqlite3:

select id from resources where preparerclass="NCBIImportPreparer" and rdatadateadded="2021-10-13" limit 1;
select id from resources where preparerclass="NCBIImportPreparer" and rdatadateadded="2021-10-13" order by id desc limit 1;

## use resource ids to get ids from biocversions table

# check first and get ids of only devel version (soon to be release)
# in this scenario 3.13 is the current release, prepping for 3.14 (the current devel) to be released
select * from biocversions where biocversion="3.14" and  resource_id between 95973 and 97713;

# use highest and lowest id to make delete call
delete from biocversions where id between 298567 and 298585;
  1. Generate or have someone generate a SAS token/URL on Microsoft Azure data lakes for the staging bucket. The general recommendation is to expire the SAS after two weeks. The scripts potentially need both the token and the URL.

  2. Export the SAS URL as an environment variable AZURE_SAS_URL with something like the following. Also export the SAS token as AZURE_SAS_TOKEN.

export AZURE_SAS_URL='https://bioconductorhubs.blob.core.windows.net/staginghub?sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D'

export AZURE_SAS_TOKEN='sp=racwl&st=2022-02-08T15:57:00Z&se=2022-02-22T23:57:00Z&spr=https&sv=2020-08-04&sr=c&sig=fBtPzgrw1Akzlz%2Fwkne%2BQrxOKOdCzP1%2Fk5S%2FHk1LguE%3D'
  3. In R:
library(AnnotationHubData)

# see the man page for clarification on testing and actively pushing
# ?makeNCBIToOrgDbsToAHM
# to populate bucket need metadataOnly = FALSE
# metadataOnly will control upload
# insert controls inserting into database

## This can be run in the same directory where the example in
## ?AnnotationForge::makeOrgPackageFromNCBI was run, to save time building the
## cache if the cache was recently generated.  You will need ~62 G of space
## to run the code

## if rebuilding the cache you will likely need to increase the default timeout
##   limit of R

options(timeout=10000)

meta <- updateResources("NonStandardOrgDb",
            BiocVersion = "3.5",
            preparerClasses = "NCBIImportPreparer",
            metadataOnly = TRUE, insert = FALSE,
            justRunUnitTest = FALSE)

# a suggested step is to save(meta, file="metadataForNonStandardOrgs")
# and scp it to your local machine; you will need to have run with
# metadataOnly=TRUE to get the correct form of the saved object
  4. Check that the Azure bucket is being populated.
  5. Once finished, check that everything populated correctly with needToRerunNonStandardOrgDb.
needToRerunNonStandardOrgDb(biocVersion="3.5", resourceDir="NonStandardOrgDb", justRunUnitTest = TRUE)

Note: We have had issues in the past with the recipe completing for all desired resources. The recipe, when repeatedly run, will check the appropriate bucket to compare what resources still need to be processed. The helper function needToRerunNonStandardOrgDb can be run to determine if a repeat call should be made. This is important because if all the resources are on Azure and metadataOnly=FALSE, it will assume you want to overwrite all the files and begin the generation over.

On Local Machine:

  1. Navigate to the AnnotationHub_docker directory

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
docker-compose up
  3. In a new terminal start R:
options(AH_SERVER_POST_URL="http://localhost:3000/resource")
options(ANNOTATION_HUB_URL="http://localhost:3000")
url <- getOption("AH_SERVER_POST_URL")
library(AnnotationHubData)

# option 1:
meta <- updateResources("NonStandardOrgDb",
            BiocVersion = "3.5",
            preparerClasses = "NCBIImportPreparer",
            metadataOnly = TRUE, insert = FALSE,
            justRunUnitTest = FALSE)


# option 2:
# if you saved the meta from the EC2 instance
load("metadataForNonStandardOrgs")


pushMetadata(meta, url)
  4. exit R

  5. Convert db to sqlite (puts the file in the data/ directory)

sudo docker exec annotationhub_annotationhub_1 bash /bin/backup_db.sh

docker exec annotationhub_docker_annotationhub_1 bash /bin/backup_db.sh
  6. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

  7. New as of Nov 2021: add the new OrgDbs manually so they are visible in the next devel. If the current release is 3.13 and devel is 3.14, and this is in preparation for the 3.14 release, this means adding a biocversion reference for 3.15.

In sqlite3 database:

## use whatever date when added
select id from resources where preparerclass="NCBIImportPreparer" and rdatadateadded="2021-10-13" limit 1;
select id from resources where preparerclass="NCBIImportPreparer" and rdatadateadded="2021-10-13" order by id desc limit 1;

## test -- should show current devel -- use the two values above for between
select * from biocversions where resource_id between 95973 and 97713;

In a separate R terminal you can quickly concatenate the values needed to add all the versions at once, replacing 3.15 with whatever the next devel is and using the low/high of the resource ids:

paste0("(3.15,",95973:97713, ")", collapse=",")

Paste that into the sqlite call:

insert into biocversions (biocversion, resource_id) Values [R output] ;

## now check again that both current devel and future devel show
select * from biocversions where resource_id between 95973 and 97713;
  8. After uploading to production you can test that they are available in release and devel:
library(AnnotationHub)
hub = AnnotationHub()
query(hub, "OrgDb")
table(mcols(query(hub, "OrgDb"))$rdatadateadded)

ExperimentHub

ExperimentHub resources are added upon request or when it is recommended that a package be an experiment data package rather than a software package. The maintainer of the package that will use the data will reach out. The process is then similar to adding contributed resources to AnnotationHub.

The user will upload files to S3 in a temporary location. Send them the instructions found here

A key will have to be generated for them to access and use the AnnotationContributor account. Go to here; under the 'Security credentials' tab click 'Create access key'. Send the Access key ID to the contributor; the Secret access key is stored in AWS. When the contributor is done you can delete the key (small 'x' at the right of the key row).

Advise that their data should be in a directory with the same name as the software package that will use the experiment data when uploading (i.e. software package Test would upload files to S3 in a folder Test -> Test/file1, Test/file2, etc.). If subdirectories are needed that is okay, but ensure the RDataPath in the metadata.csv reflects this structure.

Once the data is uploaded to S3 move the data to the proper location.

We will need a copy of the package to generate and test the annotations. Request a link to the package from the user. The following should be sent to the user to ensure the package is set up correctly: instructions.

In general, generate the list of ExperimentHubMetadata objects with makeExperimentHubMetadata() or addResources. To test that the metadata.csv is properly formatted, run makeExperimentHubMetadata.
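
For example, with a hypothetical path to the submitted package; makeExperimentHubMetadata() reads the package's inst/extdata/metadata.csv and errors if the fields are malformed:

library(ExperimentHubData)
meta <- makeExperimentHubMetadata("/path/to/MyExperimentPackage")
meta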

Info on ExperimentHub docker and how to set up docker directory.

When satisfied, start the ExperimentHub docker and add the resource to the docker.

  1. Navigate to the ExperimentHub_docker directory

  2. Start the docker:

export MYSQL_REMOTE_PASSWORD=***  (See credentials doc)
docker-compose up
  3. In a new terminal start R:
options(EXPERIMENT_HUB_SERVER_POST_URL="http://localhost:4000/resource")
options(EXPERIMENT_HUB_URL="http://localhost:4000")
library(ExperimentHubData)
url <- getOption("EXPERIMENT_HUB_SERVER_POST_URL")

# run the appropriate makeExperimentHubMetadata() call and the following if necessary
# pushMetadata(meta, url)
  4. exit R

  5. Test: from the list of running dockers (sudo docker ps), find the process with db in the name, e.g. test_db1. Connect to the container with sudo docker exec -ti test_db bash. Log into mysql with mysql -p -u hubuser (the password will be the same as the exported MYSQL_REMOTE_PASSWORD). Explore with mysql commands like select * from resources order by id desc limit 5;

  6. Convert db to sqlite (puts the file in the data/ directory). Which command to run will depend on the name of the process, but it should be one of the following:

sudo docker exec experimenthubdocker_experimenthub_1 bash /bin/backup_db.sh

docker exec experimenthub_docker_experimenthub_1 bash /bin/backup_db.sh
  7. If satisfied, copy this file to annotationhub.bioconductor.org and follow the instructions for updating the production database

Some other Notes and helpful hints:

  1. If a new recipe needs to be added, the recipe is added in AnnotationHub. Be sure to then update the version of the AnnotationHub dependency in the DESCRIPTION of ExperimentHub and bump the version of ExperimentHub.

  2. Remember that when an ExperimentHub package is accepted it is uploaded to a different repo.


