get_stats_data: Get statistics data

View source: R/get_numeric.R

get_stats_data    R Documentation

Get statistics data

Description

As with the layer numeric data, Tplyr also stores the numeric data produced by statistics such as risk difference. This helper function lets you retrieve that data from the table or layer environment.

Usage

get_stats_data(x, layer = NULL, statistic = NULL, where = TRUE, ...)

Arguments

x

A tplyr_table or tplyr_layer object

layer

Name or index of the layer(s) to select

statistic

Statistic name or index to select

where

Subset criteria passed to dplyr::filter

...

Additional arguments passed to dispatch

Details

When used on a tplyr_table object, this method will aggregate the numeric data from all Tplyr layers and calculate all statistics. The data will be returned to the user in a list of data frames. If the data has already been processed (i.e. build has been run), the numeric data is already available and the statistic data will simply be returned. Otherwise, the numeric portion of the layer will be processed.
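As a rough sketch of the two situations described above, the snippet below calls get_stats_data once on a table that has not been built and once on a table after build has been run. The helper make_table is hypothetical and exists only to create two independent table objects; the layer definition mirrors the one used in the Examples section.

```r
library(Tplyr)
library(magrittr)

# Hypothetical helper (not part of Tplyr): builds a fresh table object
# with a single count layer that carries a risk difference statistic.
make_table <- function() {
  tplyr_table(mtcars, gear) %>%
    add_layer(name = "am",
              group_count(am) %>%
                add_risk_diff(c('4', '3'))
    )
}

# Case 1: build() has not been run, so the numeric portion of the
# layer is processed when the statistic data is requested.
t_unbuilt <- make_table()
stats_unbuilt <- get_stats_data(t_unbuilt, layer = "am", statistic = "riskdiff")

# Case 2: build() has already been run, so the statistic data is
# already available and is simply returned.
t_built <- make_table()
built_output <- build(t_built)
stats_built <- get_stats_data(t_built, layer = "am", statistic = "riskdiff")
```

In both cases a single layer and a single statistic are requested, so each result should be a data frame.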

Using the layer, where, and statistic parameters, data for a specific layer statistic can be extracted and subset, giving you direct access to the data of interest. This is clearest when layers are given text names rather than referenced by index, but a numeric index works as well. If only a statistic is specified, that statistic is collected from every layer and returned as a list of data frames, allowing you to grab, for example, just the risk difference statistics across all layers.
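When only a statistic is requested, layers that do not carry that statistic may come back as NULL entries in the returned list. A minimal sketch of dropping those entries with base R is shown below; the object names rd_all and rd_present are illustrative only, and the two layers are a reduced version of the table in the Examples section.

```r
library(Tplyr)
library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(name = "cyl",
            group_count(cyl)
  ) %>%
  add_layer(name = "am",
            group_count(am) %>%
              add_risk_diff(c('4', '3'))
  )

# One list entry per layer; layers without a risk difference are
# expected to be NULL
rd_all <- get_stats_data(t, statistic = "riskdiff")

# Keep only the layers that actually produced risk difference data
rd_present <- Filter(Negate(is.null), rd_all)
```

Filtering with base R keeps the sketch free of extra dependencies; purrr::compact would be an equivalent choice if purrr is already loaded.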

Value

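The statistics data for the selected layer(s) and statistic(s): a data frame when a single layer and a single statistic are selected, a list when only a single layer or only a statistic is selected, and a list of lists otherwise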

Examples

library(Tplyr)
library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(name='drat',
            group_desc(drat)
  ) %>%
  add_layer(name="cyl",
            group_count(cyl)
  ) %>%
  add_layer(name="am",
            group_count(am) %>%
              add_risk_diff(c('4', '3'))
  ) %>%
  add_layer(name="carb",
            group_count(carb) %>%
              add_risk_diff(c('4', '3'))
  )

 # Returns a list of lists, containing stats data from each layer
 get_stats_data(t)

 # Returns just the riskdiff statistics from each layer - NULL
 # for layers without riskdiff
 get_stats_data(t, statistic="riskdiff")

 # Return the statistic data for just the "am" layer - a list
 get_stats_data(t, layer="am")
 get_stats_data(t, layer=3)

 # Return the statistic data for just the "am" and "cyl" layers - a
 # list of lists
 get_stats_data(t, layer=c("am", "cyl"))
 get_stats_data(t, layer=c(3, 2))

 # Return just the riskdiff statistic data for "am" and "cyl" - a list
 get_stats_data(t, layer=c("am", "cyl"), statistic="riskdiff")
 get_stats_data(t, layer=c(3, 2), statistic="riskdiff")


 # Return the riskdiff for the "am" layer - a data frame
 get_stats_data(t, layer="am", statistic="riskdiff")

 # Return and filter the riskdiff for the am layer - a data frame
 get_stats_data(t, layer="am", statistic="riskdiff", where = summary_var==1)


Tplyr documentation built on May 29, 2024, 10:37 a.m.