repijson"

EpiJSON is a generic JSON format for storing epidemiological data.

repijson is an R package that allows conversion between EpiJSON files and R data formats.

This vignette is a demonstration of the package repijson.

Epidemiological data is often stored and transferred as spread-sheets, databases, and text files with little standardisation in row, column and field names. A universal format enabling the coherent storage and transfer of these data is lacking. In many cases where transfer does occur, there is room for misinterpretation and preventable errors may be introduced into reports and analyses.

EpiJSON provides a potential solution for the unambiguous storage and transfer of epidemiological data. repijson facilitates the use of EpiJSON within R.

Installing repijson


To install the development version from github:

library(devtools)
install_github("hackout2/repijson")

Then, to load the package, use:

library("repijson")

The EpiJSON format

This is a simplified representation of the EpiJSON format.

#here deliberately not echoed just to show the diagram
epijsonObjectVis(textSize = 5) #textSize 4 is default

The repijson objects used to store EpiJSON are represented in the following diagram.

#here deliberately not echoed just to show the diagram
epijsonObjectVis( attribMeta = 'ejAttribute',
                  attribRecord = 'ejAttribute',
                  attribEvent = 'ejAttribute',
                  labelObject = 'ejObject',
                  labelMeta = 'ejMetadata',
                  labelRecord = 'ejRecord',
                  labelEvent = 'ejEvent',
                  textSize = 5) #textSize 4 is default

First simple example creating an repijson object from a dataframe

toyll is a small example dataframe within the repijson package. It follows the structure of a disease outbreak line list, with individuals in rows and data stored in columns. This is what the first 3 rows and 8 columns look like :

toyll[1:3,1:8]

The example below creates an ejObject from the first 3 rows. It assigns the columns "name" and "gender" as record attributes. It defines two events, "admission" and "discharge". The "hospital" column is assigned as an attribute of the first event.

#converting dates to date format
toyll$date.of.admission <- as.POSIXct(toyll$date.of.admission)
toyll$date.of.discharge <- as.POSIXct(toyll$date.of.discharge)
#create ejObject
ejOb <- as.ejObject(toyll[1:3,],
                  recordAttributes=c("name","gender"),
                  eventDefinitions=list(
                      define_ejEvent(name="admission", date="date.of.admission",attributes="hospital"),
                      define_ejEvent(name="discharge", date="date.of.discharge")
                  ))
#display ejObject
ejOb

Load the required packages for further examples.

library(OutbreakTools)
library(sp)
library(HistData)

Creating example dataframe 1.

data(Snow.deaths)

Adding some dates, pumps, some genders

simulated <- Snow.deaths
simulated$gender <- c("male","female")[(runif(nrow(simulated))>0.5) +1]
simulated$date <- as.POSIXct("1854-04-05") + rnorm(nrow(simulated), 10) * 86400
simulated$pump <- ceiling(runif(nrow(simulated)) * 5)

exampledata1<-head(simulated)
exampledata1

Creating example dataframe 2.

exampledata2<- data.frame(id=c(1,2,3,4,5),
                 name=c("Tom","Andy","Ellie","Ana","Tibo"),
                 dob=c("1981-01-12","1980-11-11","1982-02-10","1981-12-09","1983-03-08"),
                 gender=c("male","male","female","female","male"),
                 date.of.onset=c("2014-12-28","2014-12-29","2015-01-03","2015-01-08","2015-01-04"),
                 date.of.admission=c(NA,"2015-01-05","2015-01-12",NA,"2015-01-14"),
                 date.of.discharge=c(NA,NA,"2015-01-17",NA,"2015-01-17"),
                 hospital=c(NA,"St Marys","Whittington",NA,"Whittington"),
                 fever=c("yes","yes","no","no","yes"),
                 sleepy=c("no","yes","yes","no","yes"),
                 contact1.id=c("B","A","5",NA,"3D"),
                 contact1.date=c("2014-12-26","2014-12-26","2014-12-28",NA,"2014-12-28"),
                 contact2.id=c("3D","3D","5",NA,"A"),
                 contact2.date=c("2014-12-25","2014-12-26","2015-01-14",NA,"2014-12-25"),
                 contact3.id=c("B",NA,NA,NA,NA),
                 contact3.date=c("2015-01-08",NA,NA,NA,NA)
                 )

exampledata2

Transition 1: data.frame to EpiJSON format

Use the repijson package to convert a data.frame object into a EpiJSON object within R:

eg1 <- as.ejObject(exampledata1,    
    recordAttributes = "gender",    
    eventDefinitions = list(define_ejEvent(date="date", name="Death", location=list(x="x", y="y", proj4string=""), attributes="pump")),
        metadata=list())               
eg1

The repijson package does not convert dates represented as strings for you. This is because the process of conversion from character to date-time is fraught with difficulty and the hidden corruption of dates is much worse than being told by R to provide date objects. Here we convert the dates in the example two data to real dates. We use POSIXct as this is more firendly to data.frames.

exampledata2$date.of.onset <- as.POSIXct(exampledata2$date.of.onset)
exampledata2$date.of.admission <- as.POSIXct(exampledata2$date.of.admission)
exampledata2$date.of.discharge <- as.POSIXct(exampledata2$date.of.discharge)
exampledata2$contact1.date <- as.POSIXct(exampledata2$contact1.date)
exampledata2$contact2.date <- as.POSIXct(exampledata2$contact2.date)
exampledata2$contact3.date <- as.POSIXct(exampledata2$contact3.date)

We are now set to convert the exampledata2 dataframe to an EpiJSON object.

eg2 <- as.ejObject(exampledata2, recordAttributes = c("id","name","dob","gender"),
     eventDefinitions = list(define_ejEvent(name="Date Of Onset", date="date.of.onset", 
                                            attributes=list()),
                             define_ejEvent(name="Hospital admission", date="date.of.admission", 
                                            attributes=list("hospital", "fever", "sleepy")),
                             define_ejEvent(name="Hospital discharge", date="date.of.discharge"),
                             define_ejEvent(name="Contact1", date="contact1.date", attributes=list("contact1.id")),
                             define_ejEvent(name="Contact2", date="contact2.date", attributes=list("contact2.id")),
                             define_ejEvent(name="Contact3", date="contact3.date", attributes=list("contact3.id"))
                        ),
        metadata=list())
eg2

Transition 2: EpiJSON object to data.frame format

Use the repijson package to convert a JSON object into a data.frame object:

as.data.frame(eg1)
as.data.frame(eg2)

Transition 3: From obkData to an EpiJSON object

These are example data in obkData format

data(ToyOutbreak) 

Use the repijson package to convert an obkData object to JSON object into :

eg3 <- as.ejObject(ToyOutbreak)

Transition 4: From an EpiJSON object to obkData

Next function to produce

Transition 5: From an EpiJSON object to spatial

Use the repijson package to convert from an EpiJSON object to spatial (sp). Here we get the location of all the events as a SpatialPointsDataFrame

sp_eg1 <- as.SpatialPointsDataFrame.ejObject(eg1)
plot(sp_eg1,pch=20,col="green")
text(10,17,"Example from Snow Deaths data")


Try the repijson package in your browser

Any scripts or data that you put into this service are public.

repijson documentation built on May 29, 2017, 2:12 p.m.