The specimen table script is no longer needed because we are no longer using dccvalidator or the underlying specimen table. The issues in this repo contain some documentation on the original harmonization/clean-up of the ROSMAP, MSBB, and MayoRNASeq study metadata in 2020 but otherwise this repo is not active.
Tools for cleaning and organizing study data for the AD Knowledge Portal.
You can install the cleanAD package from GitHub using the remotes package.
remotes::install_github("Sage-Bionetworks/cleanAD")
The R script in inst/scripts/generate-table-ad.R will generate and upload a specimen ID table from metadata files and annotations in the format needed by the dccvalidator for checking metadata files against existing specimen and individual IDs. This script requires a configuration file located at inst/config.yml. Currently, the script does not allow for configuration files that are not installed with the package.
+-- studyname1
|   +-- Metadata
+-- studyname2
|   +-- Data
|   |   +-- Metadata
completed_successfully boolean annotation. If the task succeeds, the annotation will be updated to true, else false. Will only update annotations on the entity if the script was able to log into Synapse and has update permissions on the entity.

There are 3 ways to run the script: Rscript, bash script, docker.
Note: This is not recommended for running locally unless testing on dummy data. You should always run this script in a safe computing environment to ensure no PHI is downloaded to your local system.
Ensure cleanAD is installed on your local system and that you are able to run scripts via Rscript. You should be able to run the script with the following:
Rscript ./cleanAD/inst/scripts/generate-table-ad.R --config <config to use (e.g. default)> --auth_token <Synapse personal access token or have local .synapseConfig>
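The call above could be wrapped in a small shell function, sketched here under a few assumptions. The function name run_generate_table and the DRY_RUN switch are hypothetical and exist only in this example. Reading the token from a SYNAPSE_AUTH_TOKEN environment variable is a choice made for this sketch, not something the R script is known to do itself. Omitting --auth_token when no token is set assumes the script then falls back to a local .synapseConfig.

```shell
# Sketch only: build and run the generate-table-ad.R call.
# run_generate_table and DRY_RUN are hypothetical names for this example.
run_generate_table() {
  config="${1:-default}"
  script="./cleanAD/inst/scripts/generate-table-ad.R"

  # Start the argument list with the config name.
  set -- --config "$config"
  shown="Rscript $script --config $config"

  # Pass a token only when one is set in the environment; otherwise
  # assume a local .synapseConfig supplies the credentials.
  if [ -n "${SYNAPSE_AUTH_TOKEN:-}" ]; then
    set -- "$@" --auth_token "$SYNAPSE_AUTH_TOKEN"
    shown="$shown --auth_token <redacted>"
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command with the token value hidden instead of running it.
    printf '%s\n' "$shown"
  else
    Rscript "$script" "$@"
  fi
}
```

Calling run_generate_table default with DRY_RUN=1 set prints the command that would be run, which is a cheap way to check the arguments before touching real data.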
If you are on a Linux or Mac computer, you can use the included bash script, update_table.sh, to launch the R script. Ensure cleanAD is installed on your local system and that you are able to run scripts via Rscript. You should be able to run the script with the following:
./cleanAD/update_table.sh <config to use (e.g. default)> <Synapse personal access token or have local .synapseConfig>
A docker image has been created for running this script. You can either build the image yourself with the included Dockerfile or pull the sagebionetworks/cleanad image from DockerHub. The DockerHub image is automatically built after pushes to the main branch in this repository, although there may be a lag of up to 6 hours before the image is updated.
docker pull sagebionetworks/cleanad:latest
docker run --rm --entrypoint "./cleanAD/update_table.sh" sagebionetworks/cleanad:latest <config to use (e.g. default)> <Synapse personal access token or have local .synapseConfig>
To build the image locally, follow the steps below.
git clone https://github.com/Sage-Bionetworks/cleanAD.git
cd cleanAD
docker build -t cleanad .
Provide a Synapse PAT to the Scheduled Job secrets field named "SYNAPSE_AUTH_TOKEN":"<your-PAT-here>".
Pull the public docker image and provide the following command to execute in the container (ad and TRUE are arguments to config and as_scheduled_job respectively).
./cleanAD/scheduled_job_update_table.sh ad TRUE
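When trying the scheduled-job command outside the scheduler, a preflight check like the following sketch can catch a missing token early. It assumes the SYNAPSE_AUTH_TOKEN secret is exposed to the container as an environment variable of the same name, which is not confirmed here. The function name require_synapse_token is hypothetical.

```shell
# Sketch only: fail fast when the token is missing.
# Assumes the secret arrives as the SYNAPSE_AUTH_TOKEN environment variable;
# require_synapse_token is a hypothetical helper name.
require_synapse_token() {
  if [ -z "${SYNAPSE_AUTH_TOKEN:-}" ]; then
    echo "SYNAPSE_AUTH_TOKEN is not set" >&2
    return 1
  fi
  return 0
}

# Example use (not executed here):
# require_synapse_token && ./cleanAD/scheduled_job_update_table.sh ad TRUE
```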