NOTE: This package has been replaced with a newer version. See here:
https://github.com/ki-tools/kitools-r
and here:
https://ki-tools.github.io/kitools-r/
The kitools R package provides utility functions for setting up Knowledge Integration (Ki) projects and supporting workflows within these projects, such as finding data and publishing results.
This package is at a very early stage of development and is expected to change substantially over time. Although it is hosted publicly on GitHub, it is not intended for the general public, only for those inside Ki.
More documentation will be made available as the package matures.
# install.packages("devtools")
devtools::install_github("ki-tools/kitools")
General guidelines:
This package provides utility functions that you must use to set up a new Ki analysis project. They set up a minimal project structure, with the goal of giving you freedom while still maintaining standard behavior where necessary for Ki reproducibility and tracking needs.
create_analysis(): to create a new analysis project with the following directories:
  /R: for analysis R code
  /data/core: to locally store core datasets pulled from Synapse with data_use(id) (see "Working with data" for more on core vs. discovered vs. derived datasets)
  /data/discovered/_raw: to locally place raw data files for discovered datasets
  /data/discovered: to locally place processed discovered datasets that you are ready to publish to Synapse and share with others, published with data_publish(path)
  /data/derived: to locally place datasets derived from core or discovered datasets, typically as the result of an analysis or summarization, also published with data_publish(path)
  /data/scripts: to store R scripts that operate on raw discovered data to create discovered data, or operate on core or discovered data to produce derived data
usethis::use_git(): to initialize a git repository for the analysis
usethis::use_github(organisation = "ki-analysis"): to set a GitHub remote at ki-analysis/__analysis-name__
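To make the directory layout above concrete, here is a minimal base R sketch that creates the same folders in a temporary location. The helper name sketch_analysis_dirs() and the example project name are hypothetical and not part of kitools; create_analysis() itself may do more than this (for example, writing the project configuration file), so treat this only as an illustration of the resulting structure.

```r
# Hypothetical helper (not part of kitools): creates the directory layout
# described above under `root` and returns the created paths invisibly.
sketch_analysis_dirs <- function(root) {
  dirs <- c(
    "R",
    "data/core",
    "data/discovered/_raw",
    "data/discovered",
    "data/derived",
    "data/scripts"
  )
  paths <- file.path(root, dirs)
  for (p in paths) {
    # recursive = TRUE also creates parent directories such as "data"
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Example: build the layout inside the session's temporary directory
root <- file.path(tempdir(), "my-example-analysis")
sketch_analysis_dirs(root)
list.dirs(root, full.names = FALSE, recursive = TRUE)
```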
Names for projects should be descriptive, use lowercase text, and use dashes to separate words. This convention should be used for naming all other files in an analysis as well.
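As an illustration of this naming convention, the following base R sketch checks whether a project name uses only lowercase text with dashes between words. The helper is_valid_project_name() is hypothetical and not part of kitools, and allowing digits in names is an assumption made for this example.

```r
# Hypothetical helper (not part of kitools): TRUE when a name consists of
# groups of lowercase letters or digits separated by single dashes.
is_valid_project_name <- function(name) {
  grepl("^[a-z0-9]+(-[a-z0-9]+)*$", name)
}

# Example names (made up for illustration)
is_valid_project_name(c("growth-curve-models", "My_Analysis", "growth-curves-2019"))
# [1]  TRUE FALSE  TRUE
```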
There are three main classes of data in a Ki analysis: core, discovered, and derived.
There are several utility functions for working with data in kitools. The purpose of these data management functions is to help you keep the datasets you use and produce organized and shareable with others. Each dataset you use or produce can be registered to your project and synced with Synapse, so that others who check out your code can get up to speed quickly. This also provides some additional provenance of which datasets are used in which analyses. These utilities also help you keep your code storage on GitHub separate from your data storage on Synapse.
Data management functions:
data_use(synapse_id): This is used when you need to register a core dataset with your analysis and download a local copy.
data_publish(path, ...): This is used when you need to register a discovered or derived dataset with your analysis and push it to the appropriate place on Synapse.
data_sync(): This is used when you want to make sure you have a local copy of all datasets that have been associated with your analysis project. This is useful for collaborative environments where someone else might check out your code and want to pull all data files associated with the project.
Datasets that have been registered with your analysis project are recorded in your "project_config.yml" file, which you can view to see what is registered with your project.
On project creation, an "R" directory is created in which you can add R scripts. Here you are free to organize your files however you like, but you are encouraged to use descriptive file names that are all lowercase, contain no spaces, use underscores to separate words, and end with a capital ".R" file extension.
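The file-name guidance for R scripts can be checked with a short base R sketch like the one below. The helper is_valid_script_name() is hypothetical and not part of kitools, and allowing digits in file names is an assumption made for this example.

```r
# Hypothetical helper (not part of kitools): TRUE when the file name is
# lowercase, uses underscores between words, and ends in a capital ".R".
is_valid_script_name <- function(file) {
  # basename() drops the directory part so only the file name is checked
  grepl("^[a-z0-9]+(_[a-z0-9]+)*\\.R$", basename(file))
}

# Example paths (made up for illustration)
is_valid_script_name(c("R/fit_models.R", "R/Fit Models.r", "R/01_clean_data.R"))
# [1]  TRUE FALSE  TRUE
```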
An often-recommended approach to organizing R analysis code is to write your code as an R package. There are benefits to this approach, but we do not want to put too many constraints on project structure. There is nothing stopping you from structuring your code as a package, as the current project setup only requires a data folder and an R folder, which are a subset of an R package. However, if there are aspects of your analysis code that could be of general use beyond your analysis, you should consider creating separate R packages for those as appropriate. For example, in the CIDACS Brazil analysis, functionality for transforming DATASUS data was deemed generally useful and was developed as a separate R package.
To make it easier to work in a collaborative manner, we strongly recommend adhering to the tidyverse style guide for your analysis code.
If you need to develop a Ki-related R package, this package provides utilities to bootstrap your package in a way that conforms with Ki R package development guidance.
Several functions in the usethis package are leveraged for a Ki R package setup. If you have not used usethis, some useful setup instructions can be found here.
The following steps are recommended to set up a Ki R package:
usethis::create_package(): to create a package (if not already created)
usethis::use_git(): to initialize a git repository
usethis::use_readme_md(): to create a basic README.md file (or usethis::use_readme_rmd() for R Markdown)
usethis::use_testthat(): to set up unit tests
kitools::use_lintr_test(): to add a unit test for correct code style
kitools::use_apache_license(): to use the Apache 2.0 license (with copyright BMGF)
usethis::use_github(organisation = "ki-tools"): to set a GitHub remote at ki-tools/__package-name__
usethis::use_tidy_ci(): to set up Travis CI continuous integration and code coverage
To make it easier to work in a collaborative manner, we adhere to the tidyverse style guide.
You should add a unit test to your package that will cause the package check to fail if the code does not meet the style guide. You can set this up with the utility function kitools::use_lintr_test().
usethis::use_pkgdown(): to set up a pkgdown documentation site for the package
Please note that the 'kitools' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.