get_responses: Selecting data

View source: R/data_selection.R

get_responses    R Documentation

Selecting data

Description

Extract data from a dexter database

Usage

get_responses(
  dataSrc,
  predicate = NULL,
  columns = c("person_id", "item_id", "item_score")
)

Arguments

dataSrc

a connection to a dexter database, a matrix, or a data.frame with columns: person_id, item_id, item_score

predicate

an optional expression used to select responses; with the default, NULL, no selection is applied

columns

the columns you wish to select. This can include any column in the project; see get_variables.

Details

Many functions in Dexter accept a data source and a predicate. Predicates are very flexible, but they have one important limitation: they operate on individual responses. It is therefore not possible, for example, to remove all responses of a person from an analysis based on that person's response to a single item by using a predicate expression alone.

For such cases, Dexter supports selecting the data, manipulating it yourself, and then passing the result back to a Dexter function or using it elsewhere. The example in the Examples section illustrates this.
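The difference between selecting at the response level and at the person level can be sketched in base R without a database. The data frame below is hypothetical toy data, not part of dexter, and the missing response is stored as the string 'NA' to match the convention used in the Examples section. This is only an illustration of the idea, not of how dexter evaluates predicates internally.

```r
# Toy long-format responses (hypothetical data, one row per response).
responses <- data.frame(
  person_id  = c("p1", "p1", "p2", "p2"),
  item_id    = c("i1", "i2", "i1", "i2"),
  response   = c("a",  "NA", "b",  "c"),
  item_score = c(1L,   0L,   0L,   1L),
  stringsAsFactors = FALSE
)

# Response-level selection: conceptually what a predicate does.
# Only the single missing row is dropped; person p1 remains with one response.
row_level <- responses[responses$response != "NA", ]

# Person-level selection: flag every person with at least one missing
# response, then drop all rows belonging to those persons.
has_missing <- tapply(responses$response == "NA", responses$person_id, any)
complete_persons <- names(has_missing)[!has_missing]
person_level <- responses[responses$person_id %in% complete_persons, ]
```

Here row_level still contains person p1, while person_level contains only person p2. The second form needs information aggregated over all of a person's rows, which is why it cannot be written as a per-response predicate.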

Value

a data.frame of responses

Examples


## Not run: 
# goal: fit the extended nominal response model using only persons 
# without any missing responses
library(dplyr)

# the following does not do what we want: it omits only the missing
# responses, not the persons who have them
wrong = fit_enorm(db, response != 'NA')

# to select on an aggregate level, we need to gather the data and 
# manipulate it ourselves
data = get_responses(db, 
   columns=c('person_id','item_id','item_score','response')) |>
   group_by(person_id) |>
   mutate(any_missing = any(response=='NA')) |>
   filter(!any_missing) |>
   ungroup()  # drop the grouping before passing the data on

correct = fit_enorm(data)


## End(Not run)

dexter documentation built on Sept. 11, 2024, 6:42 p.m.