The main feature of pigeontools is to provide a reproducible, accessible, and easy-to-write workflow for data processing in change detection tasks. pigeontools splits this process into three separate steps: importing, cleaning, and processing.
This package is primarily intended to let a single lab automate a common data-processing task without copying, pasting, and editing many independent functions. It should also help anyone trying to use our data or our methodology in the future. Below we use sample data created to demonstrate these functions.
As it says on the tin, pigeon_import is all about importing the specified data files into a single list. It does not clean or process the data beyond what is strictly necessary, but instead returns all of the raw data. There are two main reasons for this:
Currently pigeon_import works on habit, datavyu, and director data. It also has a "default" option that tries its best to parse other data files, but I wouldn't count on it.
pigeon_import(method = "default", pattern.regex = NULL, pattern.exc = NULL, path = getwd())
method -- determines the type of data file being imported and the basic options that this influences. Currently accepted methods are "habit", "datavyu", "director", and "default".
pattern.regex -- a regular expression determining which file names to read.
pattern.exc -- a regular expression used to exclude files from those matched by pattern.regex.
path -- determines where the files are read from (defaults to the current working directory).
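To picture how the two pattern arguments might interact, here is a base-R sketch of include-then-exclude filtering. This is not the pigeontools internals, and the file names are invented for illustration:

```r
# Hypothetical file names; in practice these would come from
# list.files(path), as pigeon_import reads from `path`
files <- c("NumbR_s01.opf",
           "NumbR_s02.opf",
           "NumbR Number Replication notes.txt")

# pattern.regex keeps matching files; pattern.exc then drops matches
keep <- files[grepl("NumbR", files)]
keep <- keep[!grepl("Number Replication", keep)]
keep
#> [1] "NumbR_s01.opf" "NumbR_s02.opf"
```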
pigeon_clean takes a raw data list (as returned by pigeon_import) and cleans each element into a usable data frame.
pigeon_clean(x, method = "default")
x -- the data to be cleaned. It must be raw data, in a list, as imported by pigeon_import.
method -- determines how the data are cleaned. Currently accepted methods are:
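Because pigeon_clean works element-wise over the raw list, its behaviour can be pictured as applying a per-file cleaning step to each list element. This is a sketch under assumed internals; clean_one is a hypothetical stand-in for the real method-specific logic:

```r
# Hypothetical per-element cleaner; the actual method-specific
# cleaning lives inside pigeon_clean
clean_one <- function(raw) {
  df <- as.data.frame(raw)
  names(df) <- tolower(names(df))  # e.g. standardize column names
  df
}

raw_list <- list(s01 = data.frame(Subject = 1, Look = 2.5),
                 s02 = data.frame(Subject = 2, Look = 3.1))

# One usable data frame per imported file
cleaned <- lapply(raw_list, clean_one)
```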
pigeon_process takes the cleaned data (from above), aggregates the looking data, creates a single data frame for all the data, and combines multiple data frames (e.g. datavyu & habit).
pigeon_process(x = list(), method = "default", endformat = "wide", join = "inner", coder = NULL)
x -- the data being processed (formatted in a standard way, as produced by pigeon_clean). It must be a list of the two data types; for now, exactly two.
method -- the types of data being processed. Must be in the same order, and of the same length, as the elements of list x. Currently accepted methods are:
endformat -- determines whether the data will output as "wide" (default) or "long".
join -- determines how the different data will be combined. dplyr joins are used ("inner", "left", "right", "semi").
coder -- used only for method = "reliability". Determines the reference coder. If left at the default (NULL), the most frequent coder is used as the reference.
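Since join is passed through to dplyr, the effect of join = "inner" can be pictured with two toy data frames. The column names here are invented for illustration, not the actual pigeontools output:

```r
library(dplyr)

datavyu <- data.frame(subject = c(1, 2, 3), looking = c(4.2, 3.8, 5.0))
habit   <- data.frame(subject = c(2, 3, 4), trial_n = c(12, 10, 11))

# join = "inner" keeps only subjects present in both data sources;
# "left", "right", and "semi" behave as the corresponding dplyr joins
inner_join(datavyu, habit, by = "subject")
#>   subject looking trial_n
#> 1       2     3.8      12
#> 2       3     5.0      10
```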
An example of using all the above functions:
datavyu_raw <- pigeon_import("datavyu", pattern.regex = "NumbR", pattern.exc = "Number Replication")
habit_raw <- pigeon_import("habit", pattern.regex = "Number Replication")
datavyu_clean <- pigeon_clean(datavyu_raw, "datavyu")
habit_clean <- pigeon_clean(habit_raw, "habit")
finished_data <- pigeon_process(x = list(datavyu_clean, habit_clean), method = c("datavyu", "habit"))