impactR

Ease IMPACT’s data and database officers woRk

impactR started as a simple project: mainly a reminder of (very improvable) functions written and used on the go for the Burkina Faso team in 2021. It has since grown broader, and now aims to ease data teams' daily R work and to cover most tasks of the research cycle.

It is based on three spreadsheets that need to be filled in and coordinated by either assessment officers, data officers or field officers:

To monitor data collection and get a log to fill: a spreadsheet of logical tests based on the questionnaire and the Kobo tool
To clean data: a cleaning log that has been properly filled in
To analyze data: a data analysis plan
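
As a rough illustration of the first step, the sketch below applies a small table of logical tests to raw data and collects the flagged surveys into a log. It uses base R only and does not call impactR; the column names (`id_check`, `logical_test`, `why`) and the toy data are hypothetical and are not meant to describe impactR's actual check list format.

```r
# Hypothetical raw data (not from any real survey)
rawdata <- data.frame(
  uuid = c("a1", "b2", "c3"),
  hh_size = c(4, 35, 6),
  n_children = c(2, 1, 8)
)

# Hypothetical check list: one logical test per row, written as R code
check_list <- data.frame(
  id_check = c("hh_size_high", "children_gt_hh"),
  logical_test = c("hh_size > 30", "n_children > hh_size"),
  why = c("Household size above 30", "More children than household members")
)

# Evaluate each test against the data and keep one log row per flagged survey
check_log <- do.call(rbind, lapply(seq_len(nrow(check_list)), function(i) {
  flagged <- eval(parse(text = check_list$logical_test[i]), envir = rawdata)
  data.frame(
    uuid = rawdata$uuid[flagged],
    id_check = rep(check_list$id_check[i], sum(flagged)),
    why = rep(check_list$why[i], sum(flagged))
  )
}))

print(check_log)
```

Keeping the tests as text in a spreadsheet means they can be reviewed and edited without touching the R code.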

Specs:

mainly, it is aimed at data collection with Kobo
it extensively uses the tidyverse, and srvyr for survey data analysis
since version 0.7.8, it is considered robust enough and has been tested on 3 different
it requires R 4.1+ (mostly for the native pipe |>)
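
For readers new to the native pipe, here is a minimal base R sketch of what `|>` does. The toy vector is made up for illustration and nothing here is specific to impactR.

```r
# Toy household sizes (made up for illustration)
hh_size <- c(4, 7, NA, 5)

# Nested call: read from the inside out
nested <- round(mean(hh_size, na.rm = TRUE), 1)

# Same computation with the native pipe (R >= 4.1):
# the left-hand side becomes the first argument of the call on the right
piped <- hh_size |>
  mean(na.rm = TRUE) |>
  round(1)

# Both forms give the same result
identical(nested, piped)
```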

You can install the latest version of impactR from GitHub with:

# install.packages("devtools")
devtools::install_github("gnoblet/impactR", build_vignettes = TRUE)

From version 0.6 onwards, contributions should come as minimal and complete commits, as a matter of good practice. The dev branch is meant to be used from that version on, although in practice it is not used much.

Roadmap is as follows:

[x] introduce tidy eval wherever it makes sense
[x] add (re)count columns post-cleaning for multiple-choice columns and for single-choice "other" columns
[x] write more documentation
[x] tidy eval to cleaning functions
[x] dots not as the last arg, not always at least
[ ] functions to create a small report of the values that effectively changed or were removed when cleaning thanks to a cleaning log
[x] more robust functions for checking the cleaning log and the check list
[x] export clean (open)-xlsx files
[ ] add a grouping arg to make_log_outlier()
[ ] (ongoing) MSNA analysis tools: roster (education, demography, WGI), weighting functions, analysis functions
[ ] (ongoing) Split this big mess into several consolidated small packages: a viz one, an analysis one and a cleaning one

There will be a Shiny app for cleaning and monitoring (in French for now) whose repo will be collectoR. It is experimental and based on older versions of impactR.

Yay! Some documentation:

In R, use:

vignette("base_de_travail", "impactR")
vignette("main_workflow", "impactR")

These are basic examples of daily use:

# Attach all functions, equivalent to library("impactR")
box::use(impactR[...])

## basic example codes and uses (not run!)

## Import a csv file with clean names and clean types, guessing types from the maximum number of lines
# import_csv("data.csv")

## Get colnames for sector foodsec whose variables start with "f_"
# tbl_col_start(data, "f_")

## Group split to a named list
# named_group_split(data, admin2)

## Left join many tibbles
# left_joints(tibble_list, id_col)

## Make an outlier log for all numeric variables in the data.frame/tibble
# make_log_outlier(rawdata, survey, id_col = uuid, i_enum_id)

## Make a log based on logical tests, outliers and "other" answers
# make_all_logs(rawdata, 
#               survey, 
#               check_list,
#               other = "other_", 
#               id_col = uuid, 
#               i_enum_id)

## Clean from log
# make_all_logs(rawdata,
#               log,
#               survey, 
#               choices,
#               other = "other_", 
#               id_col = uuid)

## Calculate weighted proportion for shelter type by group (e.g. administrative areas or population groups)
# svy_prop(design, s_shelter_type, c(admin1, group_pop), na.rm = TRUE, stat_name = "prop", level = 0.95)
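
To make the last example more concrete, the following base R sketch computes weighted proportions of a category within groups. It does not call impactR or srvyr and produces no confidence intervals; the toy data and the `weight` column are assumptions made for illustration.

```r
# Hypothetical toy data (not from any real survey)
df <- data.frame(
  admin1 = c("A", "A", "A", "B", "B"),
  s_shelter_type = c("tent", "house", "tent", "house", "house"),
  weight = c(1, 2, 1, 3, 1)
)

# Sum of weights for each combination of group and category
w_sum <- xtabs(weight ~ admin1 + s_shelter_type, data = df)

# Weighted proportions within each admin1 (each row sums to 1)
w_prop <- prop.table(w_sum, margin = 1)

print(w_prop)
```

A design-based tool such as srvyr additionally estimates uncertainty around these proportions, which this sketch leaves out.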

gnoblet/impactR documentation built on March 20, 2023, 2:24 a.m.
