Let's apply bcstats in a few examples.
First, consider the following minimal working example. bcstats comes with two example data sets. Load the library to get started.
library(bcstatsR)
And then load the two datasets that come bundled with the library.
data(survey)
data(bc)
Let's take a look at the survey data.
print(survey)
knitr::kable(survey)
Now, take a look at the back check data (i.e., the follow-up in which highly trained surveyors re-interview the same households).
print(bc)
knitr::kable(bc)
In this example, gender, gameresult and itemssold are the variables collected in both the survey and the back check. Note that id identifies the respondent in both the survey and the back check. In the survey, enum and enumteam tell us the surveyor and the surveyor's team. We'll want to know whether or not these surveyors and teams collected the data correctly in the survey. Similarly, in the back check, we'll want to summarize the data by back checker to see if we notice unusual patterns.
Now, let's run the back check!
result <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = "gender", t2vars = "gameresult", t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer")
And auto-magically, you've created a bunch of results stored in result. Let's take a look at the back check comparison, which is stored in result$backcheck.
print(result$backcheck)
knitr::kable(result$backcheck)
Each row contains the difference between the survey and the back check by each household and variable. Cases where nothing changed have not been included in this data.frame. Now let's take a look at the error rates for Type 1 variables by each surveyor (enumerator).
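To make the shape of this table concrete, here is a minimal base R sketch of how such a comparison could be built by hand. It is not the package's implementation: the data frames survey_toy and bc_toy and their values are hypothetical, and only the idea (match on id, compare variable by variable, drop unchanged cases) comes from the text above.

```r
# Hypothetical toy data; these are NOT the data sets bundled with the package.
survey_toy <- data.frame(
  id = 1:4,
  enum = c("A", "A", "B", "B"),
  gender = c("f", "m", "f", "m"),
  itemssold = c(10, 5, 7, 3),
  stringsAsFactors = FALSE
)
bc_toy <- data.frame(
  id = 1:4,
  gender = c("f", "m", "m", "m"),
  itemssold = c(10, 6, 7, 3),
  stringsAsFactors = FALSE
)

# Match respondents on id; columns present in both get a suffix.
merged <- merge(survey_toy, bc_toy, by = "id", suffixes = c(".survey", ".bc"))

# Build one row per respondent and variable.
vars <- c("gender", "itemssold")
long <- do.call(rbind, lapply(vars, function(v) {
  data.frame(
    id = merged$id,
    enum = merged$enum,
    variable = v,
    survey = as.character(merged[[paste0(v, ".survey")]]),
    backcheck = as.character(merged[[paste0(v, ".bc")]]),
    stringsAsFactors = FALSE
  )
}))

# Keep only the rows where the two visits disagree.
diffs <- long[long$survey != long$backcheck, ]
print(diffs)
```

In this toy data, only respondent 3 (gender) and respondent 2 (itemssold) differ between visits, so diffs has two rows.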
print(result[["enum1"]]$summary)
knitr::kable(result[["enum1"]]$summary)
We can also take a look at the error rate for each Type 1 variable by enumerator.
print(result[["enum1"]]$each)
knitr::kable(result[["enum1"]]$each)
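If it helps to see what an error rate by enumerator means arithmetically, the following base R sketch computes one from a hand-made comparison table. The checks data frame and its values are hypothetical, and the calculation (share of compared values that differ) is an assumption about the general idea rather than a description of how bcstats computes its summaries.

```r
# Hypothetical comparison table: one row per respondent and variable,
# with error = TRUE when the survey and the back check disagree.
checks <- data.frame(
  enum = c("A", "A", "A", "A", "B", "B", "B", "B"),
  variable = c("gender", "gender", "gameresult", "gameresult",
               "gender", "gender", "gameresult", "gameresult"),
  error = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Error rate per enumerator, pooled over variables.
by_enum <- aggregate(error ~ enum, data = checks, FUN = mean)
print(by_enum)

# Error rate per enumerator and variable.
by_enum_var <- aggregate(error ~ enum + variable, data = checks, FUN = mean)
print(by_enum_var)
```

The pooled table plays the role of a summary, and the second table breaks the same rate down by variable.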
And we can do the same thing for Type 2 variables.
print(result[["enum2"]]$summary)
print(result[["enum2"]]$each)
knitr::kable(result[["enum2"]]$summary)
knitr::kable(result[["enum2"]]$each)
Now let's rerun the back check, this time adding a t-test for the difference between the survey data and the back check data on itemssold.
result <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = "gender", t2vars = "gameresult", t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer", ttest = "itemssold")
You can find the results for the t-test as an element of the results list.
print(result[["ttest"]]$itemssold)
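For intuition about what such a test compares, here is a small base R sketch using t.test on hypothetical values. It assumes a paired test on matched respondents, which may or may not be what bcstats does internally; the help page is the place to confirm that.

```r
# Hypothetical item counts for the same six respondents,
# once in the survey and once in the back check.
survey_items <- c(10, 5, 7, 3, 8, 6)
bc_items     <- c(10, 6, 7, 4, 9, 6)

# Paired t-test: is the mean within-respondent difference zero?
# (Assumption: pairing by respondent; not confirmed for bcstats.)
tt <- t.test(survey_items, bc_items, paired = TRUE)
print(tt)

# The estimate is the mean of survey minus back check.
mean_diff <- unname(tt$estimate)
```

A mean difference far from zero would suggest the survey systematically over- or under-reports the variable relative to the back check.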
We could have chosen not to code some changes as errors, as follows,
result <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = "gender", t2vars = "gameresult", t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer", nodiff = list(itemssold = c(0)))
or specify an acceptable range,
result <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = "gender", t2vars = "gameresult", t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer", okrange = list(itemssold = c(0, 5)))
or exclude them altogether.
result <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = "gender", t2vars = "gameresult", t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer", exclude = list(itemssold = c(0)))
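As a rough illustration of the acceptable-range idea, the sketch below contrasts a strict comparison with one that tolerates small discrepancies. The data, the name tol_range, and the rule itself (an absolute difference inside the range is not an error) are all assumptions made for illustration; the exact semantics of nodiff, okrange and exclude should be taken from the bcstats help page.

```r
# Hypothetical item counts for four respondents.
survey_items <- c(10, 5, 7, 3)
bc_items     <- c(10, 6, 7, 12)

# Difference between back check and survey for each respondent.
item_diff <- bc_items - survey_items

# Strict rule: any difference counts as an error.
strict_error <- item_diff != 0

# Tolerant rule (assumed for illustration): differences whose absolute
# value falls inside tol_range are not counted as errors.
tol_range <- c(0, 5)
within_range <- abs(item_diff) >= tol_range[1] & abs(item_diff) <= tol_range[2]
tolerant_error <- !within_range

print(data.frame(item_diff, strict_error, tolerant_error))
```

Under the strict rule two respondents are flagged; under the tolerant rule only the respondent whose count changed by 9 is flagged.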
Of course, you'll want to check multiple variables within any given type. Just pass the variable names as a character vector. For example, if you want to run the back check with both gender and gameresult as Type 1 variables, you could do the following:
result.mv <- bcstats(surveydata = survey, bcdata = bc, id = "id", t1vars = c("gender", "gameresult"), t3vars = "itemssold", enumerator = "enum", enumteam = "enumteam", backchecker = "bcer", exclude = list(itemssold = c(0)))
Check out all the features of bcstats in the help page and post an issue on GitHub if you encounter any problems.
help(bcstats)