subgroup: subgroup

Description Usage Arguments Details Value See Also Examples

View source: R/tree_functions.R

Description

Subset a user-provided data.frame according to the subgroup specified by a node in a tree.

Usage

1
subgroup(splits, node, xdata, ydata = xdata)

Arguments

splits

A data.frame of splits returned from a call to parse_rpart().

node

The NodeID of the node defining the desired split.

xdata

The data.frame of covariates to subset according to the subgroup definition.

ydata

The associated vector of response values to subset according to the subgroup definition. (optional)

Details

After the splits from an rpart.object are extracted by a call to parse_rpart(), the extracted splits define a subgroup for each node. This subgroup can be used to subset a user-provided data.frame. This function takes as its input a data.frame of splits obtained from a call to parse_rpart(), a NodeID indicating which node specifies the desired subgroup, a data.frame of covariates to subset, and (optionally) the associated response data to subset. If only xdata is specified by the user, the subset of xdata implied by the subgroup will be returned. If xdata and ydata are provided by the user, the subset of ydata will be returned (xdata is still required from the user because the subsetting is computed on the covariate values even when the data returned to the user are from ydata).

Value

A data.frame containing the data consistent with the specified subgroup.

See Also

parse_rpart, rpart, rpart.object

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
requireNamespace( "rpart", quietly = TRUE )

## Generate example data containing response, treatment, and covariates
N <- 20
continuous_response = runif( min = 0, max = 20, n = N )
trt <- sample( c('Control','Experimental'), size = N, prob = c(0.4,0.6), replace = TRUE )
X1 <- runif( N, min = 0, max = 1 )
X2 <- runif( N, min = 0, max = 1 )
X3 <- sample( c(0,1), size = N, prob = c(0.2,0.8), replace = TRUE )
X4 <- sample( c('A','B','C'), size = N, prob = c(0.6,0.3,0.1), replace = TRUE )

covariates <- data.frame( trt )
names( covariates ) <- "trt"
covariates$X1 <- X1
covariates$X2 <- X2
covariates$X3 <- X3
covariates$X4 <- X4

## Fit an rpart model
fit <- rpart::rpart( continuous_response ~ trt + X1 + X2 + X3 + X4 )

## Return parsed splits with subgroups
splits1 <- parse_rpart( fit, include_subgroups = TRUE )
splits1

## Subset covariate data according to split for NodeID 3
ex1 <- subgroup( splits = splits1, node = 3, xdata = covariates )
ex1

## Subset response data according to split for NodeID 3
ex2 <- subgroup( splits = splits1, node = 3, xdata = covariates, ydata = continuous_response )
ex2

TSDT documentation built on May 2, 2019, 6:40 a.m.