
DataSpaceR

Project status: Active. The project has reached a stable, usable state and is being actively developed.

DataSpaceR is an R interface to the CAVD DataSpace, a data sharing and discovery tool that facilitates exploration of HIV immunological data from pre-clinical and clinical HIV vaccine studies.

The package is intended for use by immunologists, bioinformaticians, and statisticians in HIV vaccine research, or anyone interested in the analysis of HIV immunological data across assays, studies, and time.

The package simplifies access to the database: it relies on the database's standardization to hide all Rlabkey-specific code from the user, and it exposes the study-specific datasets through an object-oriented interface.

Examples & Documentation

For more examples and detailed documentation, see the introductory vignette and the pkgdown site.

Installation

Install from CRAN:

install.packages("DataSpaceR")

You can install the latest development version from GitHub with devtools:

# install.packages("devtools")
devtools::install_github("ropensci/DataSpaceR")

Register and set DataSpace credentials

The database is accessed with the user's credentials. A netrc file storing login and password information is required.

  1. Create an account and read the terms of use
  2. In your R console, create a netrc file using a function from DataSpaceR:
library(DataSpaceR)
writeNetrc(
  login = "yourEmail@address.com", 
  password = "yourSecretPassword",
  netrcFile = "/your/home/directory/.netrc" # use getNetrcPath() to get the default path 
)

This creates a netrc file at the path given by netrcFile, here your home directory.

Alternatively, you can create a netrc file manually on the computer running R.

The .netrc (or _netrc) file must contain the following three entries, separated by white space (spaces, tabs, or newlines) or commas. One file can hold multiple such blocks.

machine dataspace.cavd.org
login myuser@domain.com
password supersecretpassword

See here for more information about netrc.
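The manual route can also be scripted. The following is a minimal sketch using base R only; it writes to a temporary path for illustration (an assumption, so that no real credentials file is overwritten) and reuses the placeholder login and password shown above. In practice the file would live in your home directory.

```r
# Write a minimal netrc entry by hand using base R only.
# The path below is a temporary file chosen for illustration; a real file
# would normally sit in your home directory as .netrc or _netrc.
netrc_path <- file.path(tempdir(), ".netrc")

entry <- c(
  "machine dataspace.cavd.org",
  "login myuser@domain.com",       # placeholder login from the text above
  "password supersecretpassword"   # placeholder password from the text above
)
writeLines(entry, netrc_path)

# Restrict access to the owner, since the file holds a plain-text password.
# Sys.chmod() has limited effect on Windows.
Sys.chmod(netrc_path, mode = "0600")

# Read the file back and confirm the three expected keywords are present.
fields <- scan(netrc_path, what = character(), quiet = TRUE)
stopifnot(all(c("machine", "login", "password") %in% fields))
```

Because the fields are separated by white space, the same three entries could equally be written on a single line.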

Usage

The general idea is that the user:

  1. creates an instance of the DataSpaceConnection class via connectDS
  2. browses available studies and groups in the instance via availableStudies and availableGroups
  3. creates a connection to a specific study via getStudy or a group via getGroup
  4. retrieves datasets by name via getDataset

For example:

library(DataSpaceR)

con <- connectDS()
con

connectDS() will create a connection to DataSpace.

Available studies can be listed with the availableStudies field:

knitr::kable(head(con$availableStudies))

Available groups can be listed with the availableGroups field:

knitr::kable(con$availableGroups)

Note: A group is a curated collection of participants, obtained by filtering on treatments, products, studies, or species, and is created in the DataSpace App.

Check out the reference page of DataSpaceConnection for all available fields and methods.

Create an instance of the study cvd408:

cvd408 <- con$getStudy("cvd408")
cvd408
class(cvd408)

Available datasets can be listed with the availableDatasets field:

knitr::kable(cvd408$availableDatasets)

This prints the names of the available datasets.

The Neutralizing Antibody (NAb) dataset can be retrieved with:

NAb <- cvd408$getDataset("NAb")
dim(NAb)
colnames(NAb)

Check out the reference page of DataSpaceStudy for all available fields and methods.
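The steps above can be collected into a small helper. This is only a sketch: fetch_dataset is a hypothetical name, not a function exported by DataSpaceR, and it assumes that con is the object returned by connectDS() with the getStudy and getDataset methods used above.

```r
# Hypothetical helper (not part of DataSpaceR): fetch one dataset from one study.
# `con` is assumed to be the connection object returned by connectDS().
fetch_dataset <- function(con, study, dataset) {
  stopifnot(
    is.character(study), length(study) == 1,
    is.character(dataset), length(dataset) == 1
  )
  # Turn a failure to open the study into a clearer error message.
  study_obj <- tryCatch(
    con$getStudy(study),
    error = function(e) {
      stop("Could not open study '", study, "': ", conditionMessage(e),
           call. = FALSE)
    }
  )
  study_obj$getDataset(dataset)
}

# Usage, assuming a valid netrc file and network access:
# con <- DataSpaceR::connectDS()
# NAb <- fetch_dataset(con, "cvd408", "NAb")
```

Passing the connection in as an argument, rather than calling connectDS() inside the helper, lets one connection be reused across several studies.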

Note: The package uses an R6 class to represent the connection to a study and to get around some of R's copy-on-modify behavior.
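To illustrate the difference this makes, here is a base R sketch contrasting a plain list, which is copied when modified inside a function, with an environment, which is modified in place. R6 objects are built on environments, so they share the second behavior. The names below are illustrative only and do not come from DataSpaceR.

```r
# The same update function is applied to a list and to an environment.
set_cache <- function(x) {
  x$cache <- "filled"
  invisible(NULL)
}

# A plain list: the function modifies a local copy, so the caller's
# object is unchanged.
study_list <- list(cache = "empty")
set_cache(study_list)

# An environment: the function modifies the shared object in place,
# so the caller sees the change.
study_env <- new.env()
study_env$cache <- "empty"
set_cache(study_env)

stopifnot(study_list$cache == "empty")
stopifnot(study_env$cache == "filled")
```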




ropensci/DataSpaceR documentation built on Feb. 25, 2025, 12:25 a.m.