An introduction to formhub.R ========================================================

Installation

formhub.R makes is easy to download and work with datasets on formhub. After downloading, formhub.R post-processes your dataset to convert the different columns to the correct type, which it derives from the type you specified during the creation of your XLSform. It is distributed as an R package called formhub which is not in CRAN yet, and can be installed in the following way:

  install.packages("devtools")
  library(devtools)
  install_github("formhub.R", username="SEL-Columbia")

The install_github line will need to be re-run every time you need to update the package, which will be frequent for now, as the package is in early testing. After installation, it can be loaded like you load any other R package:

library(formhub)

Download your first dataset

At this point, we should be ready to get started, and use some of the formhub functions. Likely the most useful, and the most basic, one is called formhubDownload. Try typing in help(formhubDownload) in your R terminal to see what it does. We'll use it to download the good_eats form from mberg's account in formhub, which is a public dataset and doesn't require a password. (To download data from an account with a password, simply pass it along as the third parameter).

good_eats <- formhubDownload("good_eats", "mberg")

The formhubData Object

Question: what kind of beast did we just download?

str(good_eats)

R tells us something like 'data.frame': 78 obs. of 19 variables: as well as Formal class 'formhubData' [package ".GlobalEnv"] with 5 slots. What this means is that formhubData objects can be dealt with data.frames (which makes them very convenient!) and well as "objects" with more properties (such as form, which is derived from your XLSform). The form gives formhub.R information about the exact question that was asked, and the type of the question asked (was it text or select one? or was it a date?), which lets the library change the types of the values to make them right, which is basically the power of this package.

For simplicity, if you want just a data frame and not this complicated formhubData object, you can always use the data.frame method.

good_eats_pure_data_frame <- data.frame(good_eats)

What formhub.R does for you -- type conversions

So the part where R downloaded your data for you was pretty cool. But there is more to the formhubDownload function than just downloading. In the background, the types of each of the columns is converted according to how the data was collected.

# lets inspect the types of the first 10 columns of our downloaded data
str(data.frame(good_eats)[1:10])

Notice that submit_data and submit_date, both of which were today (ie, date) questions in your form, are converted to POXIXct, which is R's date type. What does this mean? That means that we can do date-time calculations, for example, to check how long mberg has been collecting data:

max(good_eats$submit_data) - min(good_eats$submit_date)

Over a year... awesome!

Similiarly, things like select one, imei, and others are converted to factors, integers and decimals to numbers. Lets see how this compares with if we had simply just read the file as a csv without any type conversions:

good_eats2 <- read.csv("~/Downloads/good_eats_2013_05_05.csv")
# lets inspect the types of the first 10 columns of our downloaded data
str(good_eats2[1:10])

Everything is a factor! Why is that bad? Well, see the plots below for yourself:

# install.packages("ggplot2") if you don't have ggplot2 installed yet
library(ggplot2)
qplot(data=good_eats2, x=amount) # from data read in without formhub.R
qplot(data=good_eats, x=amount) # from data read in using formhub.R

Other functions in formhub.R

Okay, hopefully by now, you are sold on the usefulness of formhub.R, and see some value in it. Since this is a "basics of" document, I'll end by describing a couple of other high-level functions in formhub.R (lower-level functions will be documented over time).

And thats really the gist of it!

What if I get an error while running a function?

This is software that has been tested by only a couple of use cases so far, and writing good code in R is pretty tricky, so there are probably bugs! If you encounter one, please go to your form page, and under "Sharing", give the username "prabhasp" "View" privileges, and file an issue on github



prabhasp/formhub.R documentation built on May 25, 2019, 11:25 a.m.