calc: Compute metrics from record (e.g. vital stats) or survey data
In PHSKC-APDE/rads: Assisted computation of King County public health data


calc	R Documentation

Description

Compute metrics from record (e.g. vital stats) or survey data

Usage

calc(ph.data, ...)

## S3 method for class 'dtsurvey'
calc(
  ph.data,
  what = NULL,
  where,
  by = NULL,
  metrics = c("mean", "numerator", "denominator"),
  per = NULL,
  win = NULL,
  time_var = NULL,
  proportion = FALSE,
  fancy_time = TRUE,
  ci = 0.95,
  verbose = FALSE,
  ...
)

Arguments

`ph.data`	data.table or tbl_svy. Dataset.
`...`	not implemented
`what`	character vector. Variable to calculate metrics for.
`where`	subsetting expression
`by`	character vector. Names of variables within ph.data by which to group the calculations for `what`.
`metrics`	character. See `metrics` or scroll below for the available options.
`per`	integer. The denominator of the rate (e.g., 100000 for a rate per 100,000) when "rate" or "adjusted-rate" is selected as the metric. Metrics will be multiplied by this value.
`win`	integer. The number of consecutive units of time (e.g., years, months, etc.) over which the metrics will be calculated, i.e., the 'window' for a rolling average, sum, etc.
`time_var`	character. The name of the time variable in the dataset. Used in combination with the "win" argument to do time windowed calculations.
`proportion`	logical. For survey data, should metrics be calculated assuming the output is proportion-like? See details for more. Currently does not have functionality for non-survey data.
`fancy_time`	logical. If TRUE, a record of all the years going into the data is provided. If FALSE, just a simple range (where certain years within the range might not be represented in your data).
`ci`	numeric. Confidence level, >0 & <1, typically 0.95
`verbose`	logical. Mostly unused, but toggles on/off printed warnings.
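
The interaction of `win` and `time_var` can be pictured with a small base R sketch. It does not call rads; the data frame `toy`, the column `flag`, and the window labels are made up for illustration, and how calc itself aligns and labels windows is an assumption here.

```r
# Toy record data: one row per record, with a year and a 0/1 indicator.
# All names are illustrative and are not part of rads.
toy <- data.frame(
  chi_year = c(2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018),
  flag     = c(1,    0,    1,    1,    0,    0,    1,    1)
)

win   <- 3                              # number of consecutive years per window
years <- sort(unique(toy$chi_year))     # assumes no gaps in the years

# Each window starts at one year and covers `win` consecutive years
starts <- years[seq_len(length(years) - win + 1)]

rolled <- do.call(rbind, lapply(starts, function(s) {
  in_window <- toy$chi_year >= s & toy$chi_year < s + win
  data.frame(
    window      = paste0(s, "-", s + win - 1),
    mean        = mean(toy$flag[in_window]),
    numerator   = sum(toy$flag[in_window]),
    denominator = sum(in_window)
  )
}))

print(rolled)
```

With four years of data and `win = 3`, the sketch produces two overlapping windows, each pooling the rows from three consecutive years before the metrics are computed.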

Details

This function calculates metrics for each variable in `what`, using the rows that meet the conditions specified by `where`, for each grouping implied by `by`.
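
As a rough mental model of that sentence, the following base R sketch filters rows, groups them, and summarises one variable. It is not how calc is implemented; the data frame `toy` and the column `outcome` are hypothetical.

```r
# Hypothetical record data; none of these rows come from a real dataset
toy <- data.frame(
  chi_year = c(2016, 2016, 2016, 2016, 2017, 2016),
  chi_sex  = c("Male", "Female", "Male", "Female", "Male", NA),
  outcome  = c(1, 0, NA, 1, 1, 0)
)

# where: keep only the rows that meet the condition
kept <- subset(toy, chi_year == 2016 & chi_sex %in% c("Male", "Female"))

# by: split the variable of interest (what) into one piece per group
groups <- split(kept$outcome, kept$chi_sex)

# metrics: summarise each piece, ignoring NA values
res <- data.frame(
  chi_sex     = names(groups),
  mean        = vapply(groups, function(x) mean(x, na.rm = TRUE), numeric(1)),
  denominator = vapply(groups, function(x) sum(!is.na(x)), integer(1))
)

print(res)
```

The 2017 row and the row with a missing chi_sex are dropped by the filter, and the remaining NA in `outcome` lowers the denominator for its group rather than the mean.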

Available metrics include:

total: Count of people with the given value. Mostly relevant for surveys (where total is approximately mean * sum(pweights)). Returns total, total_se, total_upper, total_lower. total_se, total_upper, & total_lower are only valid for survey data. Default ci (i.e., upper and lower) is 95 percent.
mean: Average response and associated metrics of uncertainty. Returns mean, mean_se, mean_lower, mean_upper. Default ci (i.e., upper and lower) is 95 percent.
rse: Relative standard error. 100*se/mean.
numerator: Sum of non-NA values for 'what'. The numerator is always unweighted.
denominator: Number of rows where what is not NA. The denominator is always unweighted.
obs: Number of unique observations (i.e., rows), agnostic as to whether there is missing data for what. The obs is always unweighted.
median: The median non-NA response. Not populated when what is a factor or character. Even for surveys, the median is the unweighted result.
unique.time: Number of unique time points (from time_var) included in each tabulation (i.e., number of unique time points when the what is not missing).
missing: Number of rows in a given grouping with an NA value for what. missing + denominator = Number of people in a given group. When what is a factor/character, the missing information is provided for the other.
missing.prop: The proportion of the data that has an NA value for what.
rate: mean * per. Provides rescaled mean estimates (e.g., per 100 or per 100,000). Returns rate, rate_se, rate_lower, rate_upper. Default ci (i.e., upper and lower) is 95 percent.
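
The arithmetic relating several of the unweighted metrics above can be checked by hand in base R. This is only a sketch on a made-up vector `x`; the standard error shown assumes a simple random sample, which is an assumption of the sketch and not necessarily what calc computes.

```r
# Made-up 0/1 responses with two missing values
x   <- c(1, 0, 1, NA, 1, 0, NA, 1)
per <- 100

obs          <- length(x)              # rows, regardless of missingness
missing      <- sum(is.na(x))          # rows where x is NA
denominator  <- sum(!is.na(x))         # rows where x is not NA
numerator    <- sum(x, na.rm = TRUE)   # sum of the non-NA values
missing_prop <- missing / obs

mean_x <- numerator / denominator      # same as mean(x, na.rm = TRUE)
rate   <- mean_x * per                 # mean rescaled to "per 100"

# Assumption: simple random sample standard error of the mean
se  <- sd(x, na.rm = TRUE) / sqrt(denominator)
rse <- 100 * se / mean_x               # relative standard error

print(c(obs = obs, missing = missing, denominator = denominator,
        numerator = numerator, mean = mean_x, rate = rate, rse = rse))
```

Note that missing + denominator equals obs, matching the identity stated for the missing metric.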

For survey data, use the proportion argument where relevant to ensure metrics are calculated using proportion-specific methods (e.g., svyciprop). That is, when you want to find the fraction of ____, toggle proportion to TRUE.
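
One reason proportion-specific intervals matter: an ordinary mean +/- z * se interval can extend below 0 or above 1 for rare outcomes. The base R sketch below contrasts that interval with a delta-method interval built on the logit scale. It is a simplified analogue of a logit-type proportion interval, not the computation that svyciprop or calc performs, and the values of `p` and `se` are invented.

```r
p  <- 0.02      # invented estimated proportion
se <- 0.015     # invented standard error of that proportion
ci <- 0.95
z  <- qnorm(1 - (1 - ci) / 2)

# Ordinary interval on the probability scale; may leave [0, 1]
wald <- c(lower = p - z * se, upper = p + z * se)

# Delta method: approximate standard error on the log-odds scale,
# build the interval there, then transform back to probabilities
logit_se <- se / (p * (1 - p))
logit_ci <- plogis(qlogis(p) + c(-1, 1) * z * logit_se)

print(wald)
print(logit_ci)
```

Here the lower bound of the ordinary interval is negative, while the back-transformed interval stays strictly between 0 and 1 and is asymmetric around the estimate.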

Value

a data.table containing the results

References

https://github.com/PHSKC-APDE/rads/wiki/calc

Examples


#record data
test.data <- get_data_birth(
               year = 2015:2017,
               cols = c("chi_year", "kotelchuck",
                        "chi_sex", "fetal_pres"))

test.results <- calc(test.data,
                     what = c("kotelchuck", "fetal_pres"),
                     chi_year == 2016 & chi_sex %in% c('Male', 'Female'),
                     by = c("chi_year", "chi_sex"),
                     metrics = c("mean", "numerator", "denominator",
                                 "total"))

print(test.results)

PHSKC-APDE/rads documentation built on April 14, 2025, 10:47 a.m.