Note: Please read this Confluence page, which explains the complete architecture of how RDocumentation works.
R package that uses the pkgdown package to parse R package documentation and pass it on to the next Lambda worker, which uploads the documentation to the RDocumentation database.
We use our own fork of pkgdown: https://github.com/datacamp/pkgdown
devtools installed to ease local development
GITHUB_PAT
remotes::install_github("datacamp/pkgdown", ref = "master")
install.packages("aws.sqs", repos = c(getOption("repos"), "http://cloudyr.github.io/drat"))
Open RPackageParser.RProj in RStudio.
res <- process_package("https://cran.r-project.org/src/contrib/Archive/R6/R6_2.5.0.tar.gz", "R6", "cran")
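The tarball URL passed to process_package follows CRAN's file layout: current releases sit directly under src/contrib, while older releases sit under src/contrib/Archive/<name>/. The sketch below builds either form from a package name and version. Note that cran_tarball_url is a hypothetical helper written for illustration and is not part of RPackageParser; the process_package call is left commented out because it needs the package loaded.

```r
# Hypothetical helper (not part of RPackageParser): build a CRAN source
# tarball URL from a package name and version.
cran_tarball_url <- function(name, version, archived = FALSE,
                             base = "https://cran.r-project.org/src/contrib") {
  tarball <- sprintf("%s_%s.tar.gz", name, version)
  if (archived) {
    # Older releases live under Archive/<name>/
    paste(base, "Archive", name, tarball, sep = "/")
  } else {
    # The current release lives directly under src/contrib
    paste(base, tarball, sep = "/")
  }
}

url <- cran_tarball_url("R6", "2.5.0", archived = TRUE)
print(url)

# With RPackageParser loaded, the URL can then be passed on:
# res <- process_package(url, "R6", "cran")
```

Whether a given version is archived or current changes over time, so it is worth checking the URL in a browser before running the parser against it.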
First, add a file .env.R in the package root folder with the settings that AWS needs:
Sys.setenv(AWS_ACCESS_KEY_ID = "ACCESS_KEY_ID",
AWS_SECRET_ACCESS_KEY = "SECRET_ACCESS_KEY",
AWS_DEFAULT_REGION = "us-east-1",
DEST_QUEUE = "rdoc-app-worker",
SOURCE_QUEUE = "rdoc-r-worker",
DEADLETTER_QUEUE = "rdoc-r-worker-deadletter")
You need to add AWS keys that have write access to the SQS queues so that you can post messages to the queue.
You can find AWS_ACCESS_KEY_ID in the AWS Parameter Store, but AWS_SECRET_ACCESS_KEY is encrypted there, so you will need to request that value from the infra team.
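Before starting the worker, it can help to confirm that every variable from .env.R is actually set. The following is a minimal sketch using only base R; the variable names are copied from the example above, while missing_env_vars is a hypothetical helper and not something RPackageParser provides.

```r
# Variable names taken from the .env.R example; the check is illustrative only.
required_vars <- c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
                   "AWS_DEFAULT_REGION", "DEST_QUEUE",
                   "SOURCE_QUEUE", "DEADLETTER_QUEUE")

# Hypothetical helper: return the names of variables that are unset or empty.
missing_env_vars <- function(vars = required_vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# source(".env.R")  # assumes the file sits in the current working directory

missing <- missing_env_vars()
if (length(missing) > 0) {
  message("Missing environment variables: ", paste(missing, collapse = ", "))
} else {
  message("All required environment variables are set.")
}
```

This only checks that the values are present; it does not verify that the keys are valid or have write access to the queues.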
After that, you can run main(); this will poll the SQS queues and do all the processing:
RPackageParser::main()
If you want to add messages to the queue for local testing, set up the AWS CLI and then run:
aws sqs send-message --queue-url https://queue.amazonaws.com/301258414863/rdoc-r-worker --message-body '{"name":"ReorderCluster","version":"1.0","path":"ftp://cran.r-project.org/pub/R/src/contrib/ReorderCluster_1.0.tar.gz"}'
where you replace the body with the package that you want to test.
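To avoid hand-editing the JSON, the message body can be generated from R. The sketch below assumes the jsonlite package is installed and that the message needs only the three fields shown in the example above (name, version, path). make_message_body is a hypothetical helper, and the command is only assembled as a string, not executed.

```r
library(jsonlite)

# Hypothetical helper: build the JSON message body for one package.
# Field names (name, version, path) are copied from the example message.
make_message_body <- function(name, version, path) {
  msg <- list(name = name, version = version, path = path)
  as.character(toJSON(msg, auto_unbox = TRUE))
}

body <- make_message_body(
  "ReorderCluster", "1.0",
  "ftp://cran.r-project.org/pub/R/src/contrib/ReorderCluster_1.0.tar.gz"
)

# Assemble (but do not run) the CLI call; shQuote() protects the JSON
# from being split or interpreted by the shell.
cmd <- paste(
  "aws sqs send-message",
  "--queue-url", "https://queue.amazonaws.com/301258414863/rdoc-r-worker",
  "--message-body", shQuote(body)
)
cat(cmd, "\n")
```

The printed command can be pasted into a terminal once the AWS CLI is configured.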
Note that this is the production queue, so messages are consumed by both your local parser and the production parser, and whichever picks up a message first processes it. You might therefore need to send a few messages before your local parser picks one up.
After you add your message to the rdoc-r-worker queue, you should see it for a brief moment in AWS while it's being processed. Once processing is done, you should see new messages in the rdoc-app-worker queue (click the "Poll for messages" button in the AWS console).
If you only want to test pulling a package and generating the output that will be added to the destination queue, open this project in RStudio and run these commands in the console:
devtools::load_all(".")
library("RPackageParser")
res <- process_package("https://cran.r-project.org/src/contrib/REdaS_0.9.4.tar.gz", "REdaS", "cran")
Replace these arguments with those of the package you want to test.
write(jsonlite::toJSON(res$topics[[1]], auto_unbox = TRUE), file = 'topic.json')
This creates a topic.json file in the root of the project that contains the JSON that will be added to the queue. This is what the API processes before adding the topic to the MySQL database.
vx.y.z are deployed to production