dt.select: dt.select

Description

Selects the specified variables from a data set, with flexible control over which rows are included: the first k rows (first.k), the last k rows (last.k), or specific rows (row.indices), optionally after filtering and within groups.

Usage

dt.select(
  dat,
  the.variables = ".",
  the.filter = NULL,
  grouping.variables = NULL,
  grouping.type = "keyby",
  first.k = NULL,
  last.k = NULL,
  row.indices = NULL
)

Arguments

dat

A data.frame object.

the.variables

A character vector specifying the variables to select. Only values that exist in names(dat) will be used; other values in the.variables are ignored. When the.variables includes ".", all values in names(dat) are selected. Values of the.variables that also appear in grouping.variables are removed from the.variables (the data are still grouped by them).
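The selection rules above can be sketched in base R. This is only an illustration under stated assumptions: resolve.variables and example.dat are hypothetical names invented here, and the sketch is not taken from the DTwrappers source.

```r
# Hypothetical helper for illustration; not part of DTwrappers.
resolve.variables <- function(dat, the.variables = ".", grouping.variables = NULL) {
  all.names <- names(dat)
  # A "." anywhere in the.variables selects every column.
  if ("." %in% the.variables) {
    the.variables <- all.names
  }
  # Names that do not exist in the data are dropped.
  the.variables <- the.variables[the.variables %in% all.names]
  # Grouping columns are used for grouping, so they leave the selection.
  setdiff(the.variables, grouping.variables)
}

# Made-up data, used only for this sketch.
example.dat <- data.frame(id = 1:3, age = c(30, 41, 67), region = c("N", "S", "N"))

selected.all <- resolve.variables(example.dat, ".", grouping.variables = "region")
selected.some <- resolve.variables(example.dat, c("age", "no.such.column", "region"),
                                   grouping.variables = "region")
print(selected.all)
print(selected.some)
```

Here selected.all holds every column except the grouping column, and selected.some keeps only the requested names that exist and are not used for grouping.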

the.filter

A character value, logical value, or expression stating the logical operations used to filter the data before the rows and variables are selected.
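One way such a filter could be applied is sketched below in base R. The helper apply.filter and the data are hypothetical; how DTwrappers evaluates the.filter internally is not shown on this page, so treat this as an assumption about the general approach rather than the package's implementation.

```r
# Hypothetical helper for illustration; DTwrappers' internals may differ.
apply.filter <- function(dat, the.filter = NULL) {
  if (is.null(the.filter)) {
    return(dat)
  }
  # A character filter is parsed into an unevaluated R expression.
  if (is.character(the.filter)) {
    the.filter <- parse(text = the.filter)[[1]]
  }
  # Evaluate with the columns of dat visible as variables.
  keep <- eval(the.filter, envir = dat, enclos = parent.frame())
  # Recycle a single TRUE/FALSE to all rows; which() drops NA results.
  keep <- rep_len(as.logical(keep), nrow(dat))
  dat[which(keep), , drop = FALSE]
}

# Made-up data, used only for this sketch.
example.dat <- data.frame(id = 1:3, age = c(30, 41, 67), region = c("N", "S", "N"))

# A character filter may span several lines, as in the examples on this page.
filtered.chr <- apply.filter(example.dat, "age > 40 &
                                           region == 'N'")
# An unevaluated expression and a plain logical value are handled the same way.
filtered.expr <- apply.filter(example.dat, quote(age > 40))
filtered.lgl <- apply.filter(example.dat, TRUE)
```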

grouping.variables

A character vector specifying variables to group by in performing the computation. Only values that exist in names(dat) will be used.

grouping.type

A character value specifying whether the grouping should be sorted (keyby) or as is (by). Defaults to keyby unless "by" is specified.
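The difference in group order can be illustrated without any packages. The sketch assumes that "as is" means order of first appearance, which is how data.table's by behaves; group.order is a hypothetical helper that only reports the order in which groups would appear.

```r
# Hypothetical helper for illustration: the order in which groups are reported.
group.order <- function(values, grouping.type = "keyby") {
  # unique() keeps groups in order of first appearance ("by").
  groups <- unique(values)
  # Anything other than "by" falls back to sorted ("keyby") output.
  if (!identical(grouping.type, "by")) {
    groups <- sort(groups)
  }
  groups
}

regions <- c("West", "Northeast", "West", "South")
order.by <- group.order(regions, "by")
order.keyby <- group.order(regions, "keyby")
print(order.by)
print(order.keyby)
```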

first.k

An integer indicating how many rows to select, counting from the first row. When grouping.variables is specified, up to this many rows are selected in each group; if first.k exceeds the number of records in a group, all of that group's records are selected. Non-integer or non-positive numeric values are replaced by max(c(1, round(first.k))). If first.k is not numeric, all rows are selected. row.indices takes precedence over first.k: if row.indices is not NULL, it is used and first.k is ignored. In turn, first.k takes precedence over last.k when both are specified.
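A minimal base R sketch of the first.k rules follows: the value is normalized, then applied within each group. The helpers normalize.k and first.rows and the data are hypothetical and are not the package's code. Note that split() returns groups in sorted order, which corresponds to the "keyby" style of grouping.

```r
# Hypothetical helpers for illustration; not part of DTwrappers.
normalize.k <- function(k) {
  # Non-numeric input (including NULL) means "no limit" in this sketch.
  if (!is.numeric(k) || length(k) != 1 || is.na(k)) {
    return(Inf)
  }
  # Non-integer or non-positive values become max(c(1, round(k))).
  max(c(1, round(k)))
}

first.rows <- function(dat, first.k = NULL, grouping.variables = NULL) {
  k <- normalize.k(first.k)
  # Take at most k rows, never more than the piece contains.
  take <- function(piece) piece[seq_len(min(k, nrow(piece))), , drop = FALSE]
  if (is.null(grouping.variables)) {
    return(take(dat))
  }
  # Apply the limit separately within each group.
  pieces <- split(dat, dat[grouping.variables], drop = TRUE)
  out <- do.call(rbind, lapply(pieces, take))
  rownames(out) <- NULL
  out
}

# Made-up data: group "a" has three rows, group "b" has one.
example.dat <- data.frame(g = c("a", "a", "a", "b"), x = 1:4)
first.two <- first.rows(example.dat, first.k = 2, grouping.variables = "g")
print(first.two)
```

Group "b" contributes its single row even though first.k is 2, matching the rule that a group smaller than first.k is returned in full.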

last.k

An integer indicating how many rows to select, counting back from the last row. When grouping.variables is specified, up to this many rows are selected in each group; if last.k exceeds the number of records in a group, all of that group's records are selected. Non-integer or non-positive numeric values are replaced by max(c(1, round(last.k))). If last.k is not numeric, all rows are selected. row.indices takes precedence over last.k: if row.indices is not NULL, it is used and last.k is ignored. In addition, first.k takes precedence over last.k when both are specified.
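The row positions implied by last.k can be worked out with simple index arithmetic. The function last.k.indices below is a hypothetical illustration of the rules stated above for a single group of n.rows records, assuming last.k is numeric.

```r
# Hypothetical helper for illustration; not part of DTwrappers.
last.k.indices <- function(n.rows, last.k) {
  # Normalize as documented, then cap at the number of available rows.
  k <- min(max(c(1, round(last.k))), n.rows)
  # The last k positions: n.rows - k + 1, ..., n.rows.
  seq.int(from = n.rows - k + 1, length.out = k)
}

tail.two <- last.k.indices(5, 2)        # last two of five rows
tail.capped <- last.k.indices(3, 10)    # last.k larger than the group
tail.negative <- last.k.indices(5, -1)  # non-positive value becomes 1
print(tail.two)
print(tail.capped)
print(tail.negative)
```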

row.indices

An integer vector specifying the row indices to return. When grouping.variables is specified, these indices are applied within each group. Indices outside the range from 1 to the number of rows are limited to the rows that exist in the data or group. row.indices takes precedence over first.k and last.k: whenever it is not NULL, it is used.
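The precedence among the three row-selection parameters can be summarized in a short sketch. The function choose.rows is hypothetical, and it assumes that "limited to existing rows" means out-of-range indices are dropped; the package may handle that case differently.

```r
# Hypothetical helper for illustration; not part of DTwrappers.
# Returns the row positions to keep within one group of n.rows records.
choose.rows <- function(n.rows, first.k = NULL, last.k = NULL, row.indices = NULL) {
  # 1. row.indices wins whenever it is supplied.
  if (!is.null(row.indices)) {
    # Assumption: indices that do not exist are dropped.
    return(row.indices[row.indices >= 1 & row.indices <= n.rows])
  }
  # 2. first.k is used next, even if last.k is also supplied.
  if (is.numeric(first.k)) {
    k <- min(max(c(1, round(first.k))), n.rows)
    return(seq_len(k))
  }
  # 3. last.k is used only when neither of the others applies.
  if (is.numeric(last.k)) {
    k <- min(max(c(1, round(last.k))), n.rows)
    return(rev(n.rows - seq_len(k) + 1L))
  }
  # 4. With no usable limits, every row is kept.
  seq_len(n.rows)
}

rows.indices <- choose.rows(8, first.k = 2, last.k = 2, row.indices = 7:9)
rows.first <- choose.rows(8, first.k = 2, last.k = 2)
rows.last <- choose.rows(8, last.k = 2)
rows.all <- choose.rows(8)
```

In a group of 8 rows, row.indices = 7:9 yields positions 7 and 8 only, since row 9 does not exist.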

Source

DTwrappers::create.filter.expression

Examples

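# dt.select is provided by DTwrappers; the example data come from the formulaic package.
library(DTwrappers)

# Column names in formulaic::snack.dat used throughout the examples.
id.name <- 'User ID'
awareness.name <- 'Awareness'
consideration.name <- 'Consideration'
consumption.name <- 'Consumption'
satisfaction.name <- 'Satisfaction'
advocacy.name <- 'Advocacy'
gender.name <- 'Gender'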

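# Select two columns from every row.
dt.select(dat = formulaic::snack.dat,
          the.variables = c(id.name, awareness.name))

# Filter the rows, then select four columns, grouped by gender.
dt.select(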
 dat = formulaic::snack.dat,
 the.filter = "Age > 65 &
          Region == 'Northeast' & Product == 'Tiramisoup' &
          Awareness == 1",
 the.variables = c(
   consideration.name,
   consumption.name,
   satisfaction.name,
   advocacy.name
 ),
 grouping.variables = c(gender.name)
)

# Same selection, keeping up to the first 2 rows in each group.
dt.select(
 dat = formulaic::snack.dat,
 the.filter = "Age > 65 &
          Region == 'Northeast' & Product == 'Tiramisoup' &
          Awareness == 1",
 the.variables = c(
   consideration.name,
   consumption.name,
   satisfaction.name,
   advocacy.name
 ),
 grouping.variables = c(gender.name),
 first.k = 2
)

# Same selection, keeping up to the last 2 rows in each group.
dt.select(
 dat = formulaic::snack.dat,
 the.filter = "Age > 65 &
         Region == 'Northeast' & Product == 'Tiramisoup' &
         Awareness == 1",
 the.variables = c(
   consideration.name,
   consumption.name,
   satisfaction.name,
   advocacy.name
 ),
 grouping.variables = c(gender.name),
 last.k = 2
)

# When both first.k and last.k are specified, first.k takes precedence.
dt.select(
 dat = formulaic::snack.dat,
 the.filter = "Age > 65 & Region == 'Northeast' &
          Product == 'Tiramisoup' & Awareness == 1",
 the.variables = c(
   consideration.name,
   consumption.name,
   satisfaction.name,
   advocacy.name
 ),
 grouping.variables = c(gender.name),
 first.k = 2,
 last.k = 2
)

# row.indices is applied within each group and takes precedence over first.k and last.k.
dt.select(
 dat = formulaic::snack.dat,
 the.filter = "Age > 65 & Region == 'Northeast' & Product == 'Tiramisoup' & Awareness == 1",
 the.variables = c(
   consideration.name,
   consumption.name,
   satisfaction.name,
   advocacy.name
 ),
 grouping.variables = c(gender.name),
 row.indices = 7:9
)

dachosen1/DTwrappers documentation built on Dec. 25, 2019, 8:04 a.m.