filter_data: Apply and document inclusion/exclusion criteria

View source: R/filter_data.R

Description

This function assists in applying and documenting inclusion and exclusion criteria for clinical or epidemiologic datasets. The analyst defines a series of functions, each corresponding to one criterion, and filter_data applies them to the dataset. In many analyses, individual criteria or groups of criteria are applied sequentially, so filtering functions are specified in ... in the order in which they should be applied. The output of filter_data is a list containing the filtered dataset and a report of the total observations removed for each criterion.

Usage

filter_data(data, ...)

Arguments

data

a data.frame containing a dataset to be filtered

...

functions or lists of functions to be applied to data. Each function must accept data as an argument and remove the observations (rows) that fail a user-defined inclusion/exclusion criterion

Details

The input of this function is an unfiltered dataset (data) and a series of functions or list(s) of functions supplied through the ... argument. Each function specified in ... should apply one inclusion/exclusion criterion to data. It should take data as its input, remove the observations (rows) that fail to meet an inclusion criterion (or that meet an exclusion criterion), and return a data.frame with the same number of rows as data or fewer.
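As a minimal sketch of a function that satisfies this contract, the example below excludes participants younger than 18. The function name exclude_under_18, the column name age, and the example data are assumptions made for illustration; they are not part of the package. Rows with a missing age are treated as failing the criterion here, which is a design choice of the sketch rather than a requirement of filter_data.

```r
# Hypothetical criterion: keep only participants aged 18 or older.
# The column name `age` is an assumption made for this sketch.
exclude_under_18 <- function(data) {
  # Treat a missing age as failing the criterion; a bare `data$age >= 18`
  # would yield NA and produce all-NA rows when used as a row index.
  keep <- !is.na(data$age) & data$age >= 18
  # drop = FALSE guarantees a data.frame is returned, even for one column.
  data[keep, , drop = FALSE]
}

patients <- data.frame(id = 1:5, age = c(17, 25, NA, 40, 12))
adults <- exclude_under_18(patients)
```

The result is still a data.frame and has no more rows than the input, so a function written this way could be passed to ... either directly or inside a list.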

Functions may be supplied to ... either 1) as a single function or 2) as a list of functions. Each argument captured by ... is treated as a separate phase of the data-filtering process, and the phases are applied sequentially to the dataset. How criteria are grouped into phases may affect the number of observations listed as failing individual inclusion/exclusion criteria in the reports contained in filter_data's output. It does not affect the filtered dataset in filter_data's output.

Several criteria may be applied simultaneously as part of a single filtering "phase". To do this, specify multiple functions within a single list passed to the ... argument. A function supplied directly to ... is treated as its own individual phase in the data-filtering process.
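The effect of phase grouping on the reported counts can be illustrated with a small stand-in. The helper run_phases below is hypothetical and is not the package's implementation. It assumes that each criterion is counted against the data as it enters its phase, and that every criterion decides row by row, so the order of application within a phase does not change which rows survive.

```r
# Hypothetical stand-in for the phase logic; not part of sgexcrit.
run_phases <- function(data, phases) {
  removed <- list()
  for (phase in phases) {
    # Count what each criterion removes from the data entering this phase.
    removed <- c(removed, lapply(phase, function(f) nrow(data) - nrow(f(data))))
    # Apply every criterion in the phase to obtain the data for the next phase.
    data <- Reduce(function(d, f) f(d), phase, data)
  }
  list(data = data, removed = unlist(removed))
}

drop_A_below_4 <- function(x) x[x$A >= 4, , drop = FALSE]
drop_A_even    <- function(x) x[x$A %% 2 != 0, , drop = FALSE]

data <- data.frame(A = 1:10, B = LETTERS[1:10], stringsAsFactors = FALSE)

# One phase: both criteria are counted against all 10 rows.
together <- run_phases(data, list(
  list(drop_A_below_4 = drop_A_below_4, drop_A_even = drop_A_even)))

# Two phases: the second criterion only sees rows that survived the first.
sequential <- run_phases(data, list(
  list(drop_A_below_4 = drop_A_below_4),
  list(drop_A_even = drop_A_even)))
```

In this sketch, drop_A_even removes 5 rows when grouped with drop_A_below_4 but only 4 rows when it runs as a later phase, because rows 1 to 3 are already gone. Both groupings leave the same three rows (A equal to 5, 7, and 9).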

Descriptions of inclusion/exclusion criteria may be specified by supplying functions to ... as named arguments or named lists.
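As a rough illustration of how such descriptions travel with the functions, the sketch below collects the names carried by named arguments and named lists passed through .... The helper criterion_labels, the column names, and the descriptions are all hypothetical; how filter_data itself uses the names is not shown in this page.

```r
# Hypothetical helper, not part of sgexcrit: gather the descriptions attached
# to functions passed through `...`, in the order they were supplied.
criterion_labels <- function(...) {
  args <- list(...)
  labels <- character(0)
  for (i in seq_along(args)) {
    if (is.function(args[[i]])) {
      labels <- c(labels, names(args)[i])    # description from the argument name
    } else {
      labels <- c(labels, names(args[[i]]))  # descriptions from the list's names
    }
  }
  labels
}

labels <- criterion_labels(
  "Age under 18" = function(x) x[x$age >= 18, , drop = FALSE],
  list("Missing outcome" = function(x) x[!is.na(x$outcome), , drop = FALSE],
       "Previously enrolled" = function(x) x[!x$enrolled, , drop = FALSE])
)
```

Quoted names allow descriptions that contain spaces, which may read better in a report than syntactic names such as remove_A_equals_2.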

Value

A list containing the filtered dataset, a criterion-specific report, and a phase-specific report.

Examples

## Not run: 
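# Example dataset: ten rows, a numeric column A and a character column B
data <- data.frame(A = 1:10, B = LETTERS[1:10],
                   stringsAsFactors = FALSE)
# Two phases, each given as a named list of filtering functions.
# Phase 1 applies two criteria on column A; phase 2 applies one on column B.
filter_data(data,
            list(remove_A_equals_2 = function(x) x[x[, 1] != 2, ],
                 remove_A_equals_8 = function(x) x[x[, 1] != 8, ]),
            list(remove_B_equals_E = function(x) x[x[, 2] != "E", ]))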

## End(Not run)

graggsd/sgexcrit documentation built on April 20, 2020, 2:50 a.m.