```r
knitr::opts_chunk$set(
  collapse = TRUE,
  # comment = "#>",
  eval = FALSE
)
```
The flow of an analysis according to the data analysis guidelines:
The implementation in hypegrammaR:
Any analysis with hypegrammaR follows the same structure:
```r
remotes::install_github('ellieallien/hypegrammaR', build_opts = c())
remotes::install_github('mabafaba/surveyweights', build_opts = c())
remotes::install_github('mabafaba/koboquest', build_opts = c())
```
```r
library(hypegrammaR)
```
All input files are expected as csv files. Each input we usually expect with an assessment has its own function to load it. These functions check that the format is in line with what is expected, make sure the inputs play well with each other, and prepare them for the functionality they are used for.
First the data. A csv file with data in standard Kobo format.
```r
assessment_data <- load_data(file = "../data/testdata.csv")
```
Then a sampling frame: a csv file with one column of strata names and one column of population numbers. The strata names must exactly match values in the data. We must tell the loading function which column is which in the sampling frame.
```r
sampling_frame <- load_samplingframe("../data/test_samplingframe.csv")
```
Finally the questionnaire, which depends on the questions sheet and the choices sheet, both as csv files.
```r
questionnaire <- load_questionnaire(
  data = assessment_data,
  questions = "../data/test_questionnaire_questions.csv",
  choices = "../data/test_questionnaire_choices.csv",
  choices.label.column.to.use = "label::English"
)
```
```r
weighting <- map_to_weighting(
  sampling.frame = sampling_frame,
  data.stratum.column = "stratification",
  sampling.frame.population.column = "population",
  sampling.frame.stratum.column = "strata.names",
  data = assessment_data
)
```
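The object returned by `map_to_weighting` is itself a function: given a data frame, it returns one numeric weight per row. A minimal sketch of how you might sanity-check it, assuming the standard surveyweights behaviour and the `assessment_data` loaded above:

```r
# 'weighting' is a function: data in, one weight per row out
row_weights <- weighting(assessment_data)

# sanity checks: as many weights as rows, all strictly positive
stopifnot(length(row_weights) == nrow(assessment_data))
stopifnot(all(row_weights > 0))

summary(row_weights)
```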
You need to know which hypothesis type applies:

- `direct_reporting`
- `group_difference`
- `limit`
```r
case <- map_to_case(
  hypothesis.type = "group_difference",
  dependent.var.type = "numerical",
  independent.var.type = "categorical"
)

case
```
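For comparison, a direct-reporting case (summarising a single variable, with no group comparison) could be mapped like this. This is a sketch assuming the same `map_to_case` arguments as above; the independent variable type is simply not needed:

```r
# direct reporting: just summarise one variable, no independent variable
case_direct <- map_to_case(
  hypothesis.type = "direct_reporting",
  dependent.var.type = "categorical"
)
```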
```r
result <- map_to_result(
  data = assessment_data,
  dependent.var = "number_simultaneous_unmet_need",
  independent.var = "region",
  case = case,
  weighting = weighting
)
```
The `map_to_result` function gives you a number of things:
First, a message telling you how it went:
```r
result$message
```
That's what we want to see. If something went wrong, it should tell you here what happened.
```r
result$parameters
```
As you can see, it remembers what your input parameters were. It also added a standardised name of the analysis case.
```r
result$summary.statistic
```
In this case, the `numbers` are averages, because the dependent variable was numerical. `min` and `max` are the bounds of the corresponding confidence interval. `dependent.var.value` gives the corresponding variable values if they are categorical (`NA` otherwise).
The summary statistic will always be organised with exactly these columns, no matter what analysis you ran. This way, if you add a new visualisation or output format, it will work for any output from this function.
Next, there's information on which (if any) hypothesis test was used and the p value:
```r
result$hypothesis.test
```
You'll probably be most interested in the p-value and the type of test that was used.
```r
chart <- map_to_visualisation(result)
heatmap <- map_to_visualisation_heatmap(result)

chart
```
For advanced users (who know ggplot2): the visualisation function returns a ggplot object, so you can add or overwrite ggplot elements, for example:
```r
myvisualisation + coord_polar()
```
```r
result %>% map_to_labeled(questionnaire) -> result_labeled

chart <- result_labeled %>% map_to_visualisation
heatmap <- result_labeled %>% map_to_visualisation_heatmap
```
```r
map_to_file(chart, "barchart.jpg")
map_to_file(result$summary.statistic, "summary_statistics.csv")
```
The grammar is built from two types of elements:

- "Blocks": take the output of a mapping
- "Mappings": decide what to do, call a "block" that does it, and return another block
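Seen this way, the whole walkthrough above is one chain of mappings. A condensed sketch, reusing the same file paths, column names, and function calls shown earlier (nothing new is introduced here):

```r
library(hypegrammaR)

# blocks: the loaded inputs
assessment_data <- load_data(file = "../data/testdata.csv")
sampling_frame  <- load_samplingframe("../data/test_samplingframe.csv")

# mapping: sampling frame + data -> weighting block
weighting <- map_to_weighting(
  sampling.frame = sampling_frame,
  data.stratum.column = "stratification",
  sampling.frame.population.column = "population",
  sampling.frame.stratum.column = "strata.names",
  data = assessment_data
)

# mapping: analysis parameters -> case block
case <- map_to_case(
  hypothesis.type = "group_difference",
  dependent.var.type = "numerical",
  independent.var.type = "categorical"
)

# mapping: data + case + weighting -> result block
result <- map_to_result(
  data = assessment_data,
  dependent.var = "number_simultaneous_unmet_need",
  independent.var = "region",
  case = case,
  weighting = weighting
)

# mappings: result block -> visualisation block -> file
map_to_file(map_to_visualisation(result), "barchart.jpg")
```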