filterData: Filter dataset based on specified filters.

View source: R/dataManipulation-filterData.R

Description

A dataset can be filtered based on one or more filters, each defined on a variable of the data (see the filters argument).

**Note that by default, missing values in the filtering variable are retained (which differs from the default behaviour in R)**. To filter out records with missing values, set the keepNA parameter to FALSE.
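The difference from base R can be seen in a minimal sketch. It does not call filterData; the toy vector age and the object names are assumptions made for illustration only.

```r
# Toy vector with one missing value (hypothetical data)
age <- c(70, NA, 80)

# In base R, a comparison involving NA yields NA, not TRUE or FALSE
cond <- age <= 75

# subset() drops the records for which the condition is NA
keptBaseR <- subset(data.frame(age = age), age <= 75)

# Retaining missing values, as documented for keepNA = TRUE,
# amounts to also accepting the records where the variable is NA
keptWithNA <- data.frame(age = age)[cond | is.na(age), , drop = FALSE]

nrow(keptBaseR)
nrow(keptWithNA)
```

Here the base R subset holds only the record with age 70, while the second data frame also holds the record with the missing age.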

Usage

filterData(
  data,
  filters,
  keepNA = TRUE,
  returnAll = FALSE,
  verbose = FALSE,
  labelVars = NULL,
  labelData = "data"
)

Arguments

data

Data.frame with data.

filters

Unique filter or list of filters. Each filter should be a list containing:

  • 'var': String with variable from data to filter on.

  • 'value': (optional) Character vector with values from var to consider.

  • 'valueFct': (optional) Function (or string with the name of this function) to be applied on var to extract the value to consider.

  • 'op': (optional) String with operator used to retain records from value. If not specified, the inclusion operator '%in%' is used, i.e. records with var in value are retained.

  • 'rev': (optional) Logical, if TRUE (FALSE by default), filtering condition based on value/valueFct is reversed.

  • 'keepNA': (optional) Logical, if TRUE (by default), missing values in var are retained. If not specified, the general keepNA parameter is used.

  • 'varsBy': (optional) Character vector with variables in data containing groups to filter by.

  • 'varNew': (optional) String with name for the new variable created.

  • 'labelNew': (optional) String with label for varNew.
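How the elements of a single filter interact can be sketched in base R. The helper applyFilterSketch below is hypothetical and is not the package implementation; it only covers var, value, op, rev and keepNA, and the by-group part reflects an assumed reading of valueFct combined with varsBy.

```r
# Hypothetical helper, not the package implementation: evaluates a single
# filter and returns a logical vector (TRUE = record retained).
applyFilterSketch <- function(data, filter, keepNA = TRUE) {
	x <- data[[filter[["var"]]]]
	# 'op' falls back to the inclusion operator
	op <- if (is.null(filter[["op"]])) "%in%" else filter[["op"]]
	# a 'keepNA' given inside the filter takes precedence over the general one
	if (!is.null(filter[["keepNA"]])) keepNA <- filter[["keepNA"]]
	# without 'value', only the handling of missing values applies
	keep <- if (is.null(filter[["value"]])) {
		rep(TRUE, length(x))
	} else {
		match.fun(op)(x, filter[["value"]])
	}
	# 'rev' reverses the condition
	if (isTRUE(filter[["rev"]])) keep <- !keep
	# records with a missing 'var' are retained or dropped according to keepNA
	keep[is.na(x)] <- keepNA
	keep
}

# Toy data (hypothetical values)
toy <- data.frame(AGE = c(70, NA, 80), SEX = c("M", "F", "M"))
applyFilterSketch(toy, list(var = "AGE", value = 75, op = "<="))
applyFilterSketch(toy, list(var = "SEX", value = "M", rev = TRUE))

# Assumed reading of 'valueFct' with 'varsBy': the function is evaluated
# within each group, here flagging the maximum VAL within each ID
byGroup <- data.frame(ID = c("a", "a", "b"), VAL = c(1, 3, 2))
groupMax <- ave(byGroup$VAL, byGroup$ID, FUN = max)
isGroupMax <- byGroup$VAL == groupMax
```

Returning a logical vector rather than a subset keeps the sketch usable both for subsetting and for flagging records.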

If a list of filters is specified, the logical operator (see Logic) linking the different conditions can be specified between the two conditions, e.g.: list(list(var = "SEX", value = "F"), "&", list(var = "COUNTRY", value = "DEU")).
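The linking of conditions by logical operators can be pictured with a small base R sketch. The helper combineSketch and the toy data are hypothetical; evaluating the conditions from left to right is an assumption of this sketch, not a documented property of filterData.

```r
# Toy data (hypothetical values)
toy <- data.frame(
	SEX = c("F", "F", "M"),
	COUNTRY = c("DEU", "USA", "DEU")
)

# One logical vector per condition
isFemale <- toy$SEX %in% "F"
isGermany <- toy$COUNTRY %in% "DEU"

# Hypothetical helper: combines the conditions from left to right,
# using "&" wherever no operator is given
combineSketch <- function(conditions, operators = rep("&", length(conditions) - 1)) {
	keep <- conditions[[1]]
	for (i in seq_along(operators)) {
		keep <- match.fun(operators[i])(keep, conditions[[i + 1]])
	}
	keep
}

keepAnd <- combineSketch(list(isFemale, isGermany))
keepOr <- combineSketch(list(isFemale, isGermany), operators = "|")
toy[keepAnd, ]
toy[keepOr, ]
```

With "&", only the first record (female, DEU) is retained; with "|", all three records are retained.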

keepNA

Logical, if TRUE (by default) missing values in var are retained. If set to FALSE, records with missing values in var are filtered out for all filters. A keepNA specified within filters takes precedence over this parameter.

returnAll

Logical:

  • if FALSE (by default): the data for only the filtered records is returned.

  • if TRUE: the full data is returned. Records are flagged based on the filters condition in a new column, varNew (if specified) or 'keep' otherwise, containing TRUE if the record fulfills all conditions and FALSE otherwise.
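The two return modes can be contrasted with a base R sketch. It does not call filterData; the toy data and the object names are assumptions for illustration.

```r
# Toy data (hypothetical values)
toy <- data.frame(USUBJID = c("01", "02", "03"), AGE = c(70, 80, 65))

# Filtering condition: AGE at most 75
keep <- toy$AGE <= 75

# Comparable to returnAll = FALSE: only the retained records
filtered <- toy[keep, , drop = FALSE]

# Comparable to returnAll = TRUE: all records, flagged in a 'keep' column
flagged <- cbind(toy, keep = keep)

nrow(filtered)
flagged
```

Flagging rather than subsetting keeps the excluded records available, for example to count how many records each filter removes.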

verbose

Logical, if TRUE (FALSE by default) progress messages are printed in the current console. For the visualizations, progress messages during download of subject-specific report are displayed in the browser console.

labelVars

Named character vector containing variable labels.

labelData

(optional) String with label for input data, that will be included in progress messages.

Value

Filtered data if returnAll is FALSE (by default). Otherwise, data with an additional column, keep or varNew (if specified), containing TRUE for records which fulfill the specified condition(s) and FALSE otherwise.

Author(s)

Laure Cougnaud

Examples

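# filterData is provided by clinDataReview; the example dataset by clinUtils
library(clinDataReview)
library(clinUtils)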

data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")

dataDM <- dataADaMCDISCP01$ADSL

## single filter

# filter with inclusion criteria:
filterData(
	data = dataDM, 
	filters = list(var = "SEX", value = "M"), 
	verbose = TRUE
)

# filter with non-inclusion criteria
filterData(
	data = dataDM, 
	filters = list(var = "SEX", value = "M", rev = TRUE), 
	verbose = TRUE
)

# filter based on inequality operator
filterData(
	data = dataDM, 
	filters = list(var = "AGE", value = 75, op = "<="), 
	verbose = TRUE
)

# missing values are retained by default!
dataDMNA <- dataDM
dataDMNA[1 : 2, "AGE"] <- NA
filterData(
	data = dataDMNA, 
	filters = list(var = "AGE", value = 75, op = "<="), 
	verbose = TRUE
)

# filter out missing values in addition to the condition on the variable
filterData(
	data = dataDMNA, 
	filters = list(var = "AGE", value = 75, op = "<=", keepNA = FALSE), 
	verbose = TRUE
)

# retain only missing values
filterData(
	data = dataDMNA, 
	filters = list(var = "AGE", value = NA, keepNA = TRUE), 
	verbose = TRUE
)

# filter out missing values only (no condition on value)
filterData(
	data = dataDMNA, 
	filters = list(var = "AGE", keepNA = FALSE), 
	verbose = TRUE
)


## multiple filters

# by default the records fulfilling all conditions are retained ('AND')
filterData(
	data = dataDM, 
	filters = list(
		list(var = "AGE", value = 75, op = "<="),
		list(var = "SEX", value = "M")
	), 
	verbose = TRUE
)

# custom operator:
filterData(
	data = dataDM, 
	filters = list(
		list(var = "AGE", value = 75, op = "<="),
		"|",
		list(var = "SEX", value = "M")
	), 
	verbose = TRUE
)

clinDataReview documentation built on July 14, 2021, 5:08 p.m.