#' Calculate Slope Index of Inequality using phe_sii
#'
#' @description
#' `phe_sii()`returns the slope index of inequality (SII) statistic for each
#' subgroup of the inputted dataframe, with lower and upper confidence limits
#' based on the specified confidence.
#'
#' @details
#'
#' The Relative Index of Inequality (RII) can also be returned via an optional
#' argument.
#'
#' The SII and RII are two measures of health inequality. They show the relation
#' between the level of health or frequency of a health problem in different
#' population groups and the ranking of these groups on the social scale.
#'
#' The input dataframe should be grouped before passing to the function if an
#' SII/RII for each subgroup is required, and quantiles ordered from least to
#' most advantaged.
#'
#' @section Calculation:
#'
#' The SII is calculated using linear regression (1). To allow for differences
#' in population size between quantiles (e.g. deprivation deciles), each is
#' given a rank score (or relative rank) based on the midpoint of its range in
#' the cumulative distribution of the total area population. The quantiles are
#' first ordered (e.g from 1 most deprived to 10 least deprived for deprivation
#' deciles). If quantile 1 then contains 12 percent of the total population, its
#' relative rank is \code{0.12/2=0.6}. If quantile 2 includes 10 percent of the
#' population, its relative rank is \code{0.12+(0.10/2)=0.17}. A square root
#' transformation is applied to the regression to account for heteroskedasticity
#' (the tendancy for the variances of the quantile values to be related to the
#' size of the values, ie larger values will tend to have larger variances). A
#' regression model is fitted to the transformed data: \eqn{Y * \sqrt a = \sqrt
#' a + b * \sqrt a}, where Y is the value of the indicator for the quantile, a
#' is the proportion of the total population in the quantile and b is the
#' relative rank. The SII is the gradient of the resulting fitted line, and
#' could be positive or negative according to the indicator polarity. Since the
#' relative ranks, by definition, range from 0 to 1, the SII is the difference
#' between the fitted value at \code{x=1} and \code{x=0}. The RII is the ratio
#' of the fitted value at \code{x=1,Y1} and the fitted value at \code{x=0,Y0}.
#' which can be calculated as: \code{RII = (Y0 + SII)/Y0}
#'
#' @section Transformations:
#'
#' The indicator type can be specified as 1 (rate), 2 (proportion) or 0 (other),
#' using the \code{value_type} parameter. This setting determines the data
#' transformations that will be applied in the following two parts of the
#' method.
#'
#' Use in conjunction with the \code{transform} parameter in calculation of the
#' SII: It is recommended that rates and proportions are transformed prior to
#' calculation of the SII by setting the \code{transform} parameter to TRUE for
#' these indicator types. This will perform a log transformation for rates, or
#' logit for proportions, and return outputs transformed back to the original
#' units of the indicator. These transformations are recommended to improve the
#' linearity between the indicator values and the quantile, which is an
#' assumption of the method. A user-provided standard error will not be accepted
#' when the \code{transform} parameter is set to TRUE as the confidence limits
#' are required for this transformation.
#'
#' Use in calculation of the standard error: Rates and proportions, and their
#' confidence limits, are transformed prior to calculation of the standard error
#' for each quantile. This is because it is assumed that the confidence interval
#' around the indicator value is non-symmetric for these indicator types. Note
#' that this transformation is not controlled by the \code{transform} parameter
#' and is applied based on the value of the \code{value_type} parameter only. A
#' user-provided standard error will not be accepted when the \code{transform}
#' parameter is set to TRUE as the confidence limits are required for this
#' transformation.
#'
#' @section Warning:
#'
#' The SII calculation assumes a linear relationship between indicator value and
#' quantile. Where this is not the case the transform option should be considered.
#' Small populations within quantiles can make the SII unstable. This
#' function does not include checks for linearity or stability; it is the user's
#' responsibility to ensure the input data is suitable for the SII calculation.
#'
#' @param data data.frame containing the required input fields, pre-grouped if
#' an SII is required for each subgroup; unquoted string; no default
#' @param quantile field name within data that contains the quantile label (e.g.
#' decile). The number of quantiles should be between 5 and 100; unquoted
#' string; no default
#' @param population field name within data that contains the quantile
#' populations (ie, denominator). Non-zero populations are required for all
#' quantiles to calculate SII for an area; unquoted string; no default
#' @param x (for indicators that are proportions) field name within data that
#' contains the members of the population with the attribute of interest (ie,
#' numerator). This will be divided by population to calculate a proportion as
#' the indicator value (if value field is not provided); unquoted string; no
#' default
#' @param value field name within data that contains the indicator value (this
#' does not need to be supplied for proportions if count and population are
#' given); unquoted string; no default
#' @param value_type indicates the indicator type (1 = rate, 2 = proportion, 0 =
#' other). The \code{value_type} argument is used to determine whether data should
#' be transformed prior to calculation of the standard error and/or SII. See
#' the \code{Tansformations} section for full details; integer; default 0
#' @param transform option to transform input rates or proportions prior to
#' calculation of the SII. See the \code{Transformations} section for full
#' details; logical; default FALSE
#' @param lower_cl field name within data that contains 95 percent lower
#' confidence limit of indicator value (to calculate standard error of
#' indicator value). This field is needed if the se field is not supplied;
#' unquoted string; no default
#' @param upper_cl field name within data that contains 95 percent upper
#' confidence limit of indicator value (to calculate standard error of
#' indicator value). This field is needed if the se field is not supplied;
#' unquoted string; no default
#' @param se field name within data that contains the standard error of the
#' indicator value. If not supplied, this will be calculated from the 95
#' percent lower and upper confidence limits (i.e. one or the other of these
#' fields must be supplied); unquoted string; no default
#' @param multiplier factor to multiply the SII and SII confidence limits by
#' (e.g. set to 100 to return prevalences on a percentage scale between 0 and
#' 100). If the multiplier is negative, the inverse of the RII is taken to
#' account for the change in polarity; numeric; default 1;
#' @param repetitions number of random samples to perform to return confidence
#' interval of SII (and RII). Minimum is 1000, no maximum (though the more
#' repetitions, the longer the run time); numeric; default 100,000
#' @param confidence confidence level used to calculate the lower and upper
#' confidence limits of SII, expressed as a number between 0.9 and 1, or 90
#' and 100. It can be a vector of 0.95 and 0.998, for example, to output both
#' 95 percent and 99.8 percent CIs; numeric; default 0.95
#' @param rii option to return the Relative Index of Inequality (RII) with
#' associated confidence limits as well as the SII; logical; default FALSE
#' @param intercept option to return the intercept value of the regression line
#' (y value where x=0); logical; default FALSE
#' @param reliability_stat option to carry out the SII confidence interval
#' simulation 10 times instead of once and return the Mean Average Difference
#' between the first and subsequent samples (as a measure of the amount of
#' variation). Warning: this will significantly increase run time of the
#' function and should first be tested on a small number of repetitions;
#' logical; default FALSE
#' @param type "full" output includes columns in the output dataset specifying
#' the parameters the user has input to the function (value_type, multiplier,
#' CI_confidence, CI_method); character string either "full" or "standard";
#' default "full"
#'
#' @references
#' (1) Low A & Low A. Measuring the gap: quantifying and comparing local health inequalities.
#' Journal of Public Health; 2004;26:388-395. \cr \cr
#'
#' @import dplyr
#' @import broom
#' @importFrom rlang quo_text
#' @importFrom purrr map
#' @importFrom tidyr nest unnest spread
#' @importFrom stats rnorm qnorm lm
#' @importFrom tidyselect where
#' @importFrom rlang := .data .env
#'
#' @examples
#' library(dplyr)
#'
#' data <- data.frame(area = c(rep("Area1", 10), rep("Area2", 10)),
#' decile = c(1:10, 1:10),
#' population = c(7291, 7997, 6105, 7666, 5790, 6934, 5918, 5974, 7147, 7534, 21675,
#' 20065, 19750, 24713, 20112, 19618, 22408, 19752, 18939, 19312),
#' value = c(75.9, 78.3, 83.8, 83.6, 80.5, 81.1, 81.7, 84.2, 80.6, 86.3, 70.5,
#' 71.6, 72.5, 73.5, 73.1, 76.2, 78.7, 80.6, 80.9, 80),
#' lowerCL = c(72.7,75.3,80.9,80.2,77.1,78,79,81.4,75.8,83.2,
#' 70.1,71.1,72,73.1, 72.7, 75.7, 78.2,80.1,80.4,79.5),
#' upperCL = c(79.1,81.4,86.8,87.1,83.8,84.2,84.4,86.9,85.4,
#' 89.4,71,72.1,73.2,73.7,75.8,78.8,79.8,81.2,81.3,80.9),
#' StandardError = c(1.64,1.58,1.51,1.78,1.7,1.56,1.37,1.4,2.43,
#' 1.57,0.23,0.26,0.3,0.16,0.79,0.78,0.4,0.28,0.23,0.35)
#' )
#'
#'
#' # Run SII function on the two areas in the data
#' phe_sii(group_by(data, area),
#' decile,
#' population,
#' value_type = 0, # default normal distribution
#' value = value,
#' lower_cl = lowerCL,
#' upper_cl = upperCL,
#' confidence = 0.95,
#' rii = TRUE,
#' type = "standard")
#'
#' # Supplying the standard error instead of the indicator 95 percent confidence limits
#' # gives the same result
#' phe_sii(group_by(data, area),
#' decile,
#' population,
#' value_type = 0,
#' value = value,
#' se = StandardError,
#' confidence = 0.95,
#' rii = TRUE,
#' type = "standard")
#'
#' # multiple confidence intervals, log transforming the data if they are rates
#' phe_sii(group_by(data, area),
#' decile,
#' population,
#' value_type = 1,
#' transform = TRUE,
#' value = value,
#' lower_cl = lowerCL,
#' upper_cl = upperCL,
#' confidence = c(0.95, 0.998),
#' repetitions = 10000,
#' rii = TRUE,
#' type = "standard")
#'
#' @export
#'
#' @return The SII with lower and upper confidence limits for each subgroup of
#' the inputted data.frame.
#'
#' @family PHEindicatormethods package functions
# -------------------------------------------------------------------------------------------------
phe_sii <- function(data, quantile, population, # compulsory fields
x = NULL, # optional fields
value = NULL,
value_type = 0,
transform = FALSE,
lower_cl = NULL,
upper_cl = NULL,
se = NULL,
multiplier = 1,
repetitions = 100000,
confidence = 0.95,
rii = FALSE,
intercept = FALSE,
reliability_stat = FALSE,
type = "full") {
# Part 1 - Checks on input data ---------------------------------------------
if (missing(data)| missing(quantile)| missing(population)) {
stop("function phe_sii requires the arguments: data, quantile, population")
}
if (missing(value) & missing(x)) {
stop("function phe_sii requires value field, or x field if indicator is a proportion of population")
}
if (missing(se) & (missing(upper_cl) | missing(lower_cl))) {
stop("function phe_sii requires either lower_cl and upper_cl fields, or se field")
}
if (!(value_type %in% c(0,1,2))) {
stop("value_type should be 0, 1 or 2")
}
if (!(class(multiplier) %in% c("numeric", "integer") & class(repetitions) %in% c("numeric", "integer") & class(confidence) %in% c("numeric", "integer"))) {
stop("multiplier, repetitions and confidence inputs should be numeric")
}
if (repetitions < 1000) {
stop("number of repetitions must be 1000 or greater. Default is 100,000")
}
# if transform is true then value type must be rate or proportion
if (transform == TRUE & value_type == 0) {
stop("value_type should be 1 or 2 when transform is true")
}
# if transform is true then se cannot be provided
if (transform == TRUE & !(missing(se))) {
stop("function phe_sii requires se to be missing when transform is true")
}
# if transform is true then upper and lower cls must be provided
if (transform == TRUE & (missing(upper_cl) | missing(lower_cl))) {
stop("function phe_sii requires lower_cl and upper_cl fields when transform is true")
}
# check on confidence limit requirements
if (any(confidence < 0.9) | (any(confidence > 1) & any(confidence < 90)) | any(confidence > 100)) {
stop("all confidence levels must be between 90 and 100 or between 0.9 and 1")
}
# Use NSE on inputs - apply quotes
quantile = enquo(quantile)
population = enquo(population)
if(!missing(x)) {x = enquo(x)}
if(!missing(value)) {value = enquo(value)}
if(!missing(se)) {se = enquo(se)}
if(!missing(lower_cl)) {lower_cl = enquo(lower_cl)}
if(!missing(upper_cl)) {upper_cl = enquo(upper_cl)}
# scale confidence level
confidence[confidence >= 90] <- confidence[confidence >= 90] / 100
# check for non numeric inputs
if(!(class(pull(data, {{ population }})) %in% c("numeric", "integer")
& ifelse(rlang::quo_text(x) %in% names(data), (class(pull(data, {{ x }})) %in% c("numeric", "integer")), TRUE)
& ifelse(rlang::quo_text(value) %in% names(data), (class(pull(data, {{ value }})) %in% c("numeric", "integer")), TRUE)
& ifelse(rlang::quo_text(se) %in% names(data), (class(pull(data, {{ se }})) %in% c("numeric", "integer")), TRUE)
& ifelse(rlang::quo_text(lower_cl) %in% names(data), (class(pull(data, {{ lower_cl }})) %in% c("numeric", "integer")), TRUE)
& ifelse(rlang::quo_text(upper_cl) %in% names(data), (class(pull(data, {{ upper_cl }})) %in% c("numeric", "integer")), TRUE))) {
stop("some input fields in data.frame are non-numeric")
}
# check for zero or negative populations
negative_pops <- data %>%
filter({{ population }} <= 0 | is.na({{ population }}))
if (nrow(negative_pops) > 0) {
stop("some groups have a zero, negative or missing population")
}
# check for negative/missing standard errors
if(rlang::quo_text(se) %in% names(data)) {
negative_se <- data %>%
filter({{ se }} < 0 | is.na({{ se }}))
if (nrow(negative_se) > 0) {
stop("negative or missing standard errors in input dataset")
}
}
# check for missing confidence limits
if(rlang::quo_text(lower_cl) %in% names(data) & rlang::quo_text(upper_cl) %in% names(data)) {
negative_cl <- data %>%
filter(is.na({{ lower_cl }}) | is.na({{ upper_cl }}))
if (nrow(negative_cl) > 0) {
stop("missing lower or upper confidence limits in input dataset")
}
}
# checks on PROPORTIONS
if(value_type == 2) {
# check for proportions outside (0,1) range
if(rlang::quo_text(value) %in% names(data)) {
invalid_prop <- data %>%
filter({{ value }} < 0 | {{ value }} > 1)
if (nrow(invalid_prop) > 0) {
stop("value proportions are not all between 0 and 1")
}
}
# check for lower and upper CLs outside (0,1) range
if(rlang::quo_text(lower_cl) %in% names(data) &
rlang::quo_text(upper_cl) %in% names(data)) {
invalid_prop_cl <- data %>%
filter({{ lower_cl }} < 0 | {{ lower_cl }} > 1 |
{{ upper_cl }} < 0 | {{ upper_cl }} > 1)
if (nrow(invalid_prop_cl) > 0) {
stop("confidence limit proportions are not all between 0 and 1")
}
}
# check for zero or negative counts
if(!(rlang::quo_text(value) %in% names(data)) &
rlang::quo_text(x) %in% names(data)) {
negative_x <- data %>%
filter({{ x }} <= 0)
if (nrow(negative_x) > 0) {
stop("some groups have a zero or negative count x")
}
}
}
# Part 2 - Start calculations ---------------------------------------------
# extract grouping variables of input dataset (if any)
grouping_variables <- group_vars(data)
# Convert factors to character
data <- data %>%
ungroup() %>%
mutate(across(where(is.factor), as.character)) %>%
group_by(!!! syms(c(grouping_variables)))
# Extract vector of quantiles and save the number to "no_quantiles"
quantile_list <- unique(select(ungroup(data), {{ quantile }}))
no_quantiles <- nrow(quantile_list)
# Output warning on number of quantiles inputted
if (no_quantiles < 5 | no_quantiles > 100) {
stop("Number of quantiles must be between 5 and 100")
} else if (no_quantiles > 10) {
warning("WARNING: Small values can make SII unstable when using a large number of quantiles")
}
# Remove records with missing essential data
if (rlang::quo_text(se) %in% names(data)) {
valid_complete <- data %>%
filter({{ population }} > 0,
!is.na({{ se }}))
} else if (rlang::quo_text(lower_cl) %in% names(data) &
rlang::quo_text(upper_cl) %in% names(data)) {
valid_complete <- data %>%
filter({{ population }} > 0,
!is.na({{ lower_cl }}), !is.na({{ upper_cl }}))
}
# Not all quantiles may have data for each grouping
# Start by counting the number of quantiles each area has data for -
# exclude any areas with missing data (SII cannot be calculated)
valid_areas <- valid_complete %>%
summarise(n = length(unique({{ quantile }}))) %>%
filter(n == no_quantiles)
# Create table of areas to calculate SII for
valid_deciles <- valid_areas %>%
merge(quantile_list, # Merge on list of quantiles
all.x = TRUE,
all.y = TRUE)
if (nrow(valid_deciles) != nrow(data)) {
warning("WARNING: some records have been removed due to incomplete or invalid data")
}
# join provided data to valid decile table
pops_prep <- left_join(valid_deciles, data,
by = c(grouping_variables, rlang::quo_text(quantile))) %>%
group_by(!!! syms(c(grouping_variables, rlang::quo_text(quantile)))) %>%
arrange(!!! syms(c(grouping_variables, rlang::quo_text(quantile))))
# Calculate indicator value (if not supplied in input data) as proportion for each
# quantile (x/population)
if (rlang::quo_text(value) %in% names(pops_prep)) {
pops_prep <- mutate(pops_prep, value = {{ value }})
} else if (value_type == 2) {
pops_prep <- mutate(pops_prep, value = {{ x }} / {{ population }})
}
# Transform value if value is a rate or proportion
pops_prep <- pops_prep %>%
mutate(value = ifelse(value_type == 1,
log(.data$value),
ifelse(value_type == 2,
log(.data$value / (1 - .data$value)),
.data$value)))
# Transform lower and upper confidence limits in the case of a rate or proportion
if (rlang::quo_text(lower_cl) %in% names(pops_prep) &
rlang::quo_text(upper_cl) %in% names(pops_prep)) {
pops_prep <- pops_prep %>%
mutate(lower_cl = ifelse(value_type == 0, {{ lower_cl }},
ifelse(value_type == 1, log({{ lower_cl }}),
ifelse(value_type == 2, log({{ lower_cl }} / (1 - {{ lower_cl }})),
NA))),
upper_cl = ifelse(value_type == 0, {{ upper_cl }},
ifelse(value_type == 1, log({{ upper_cl }}),
ifelse(value_type == 2, log({{ upper_cl }} / (1 - {{ upper_cl }})),
NA))))
}
# Calculate standard error (if not supplied in input data), from lower and upper CLs
z <- stats::qnorm(0.975) # hard-coded at 95% confidence
if (rlang::quo_text(se) %in% names(pops_prep)) {
pops_prep <- mutate(pops_prep,
se_calc = {{ se }})
} else {
pops_prep <- mutate(pops_prep,
se_calc = (upper_cl - lower_cl) / z / 2)
}
# Calculate a and b vals
pops_prep_ab <- pops_prep %>%
group_by(!!! syms(grouping_variables)) %>%
mutate(a_vals = {{ population }}/ sum({{ population }}), # Proportion of total population of subgroup
b_vals = FindXValues({{ population }}, no_quantiles))
# Calculate sqrt(a), bsqrt(a) and un-transformed y value for regression
if(transform == FALSE) {
pops_prep_ab <- pops_prep_ab %>%
group_by(!!! syms(c(grouping_variables, rlang::quo_text(quantile)))) %>%
mutate(sqrt_a = sqrt(.data$a_vals),
b_sqrt_a = .data$b_vals * .data$sqrt_a,
value_transform = ifelse(value_type == 1, exp(.data$value),
ifelse(value_type == 2,
exp(.data$value) / (1 + exp(.data$value)),
.data$value)),
yvals = .data$sqrt_a * .data$value_transform)
} else {
pops_prep_ab <- pops_prep_ab %>%
group_by(!!! syms(c(grouping_variables, rlang::quo_text(quantile)))) %>%
mutate(sqrt_a = sqrt(.data$a_vals),
b_sqrt_a = .data$b_vals * .data$sqrt_a,
value_transform = .data$value,
yvals = .data$sqrt_a * .data$value_transform)
}
# calculate confidence interval for SII via simulation
# Repeat this 10 times to get a "variability" measure if requested
# Nest data (different argument needed for grouped vs. ungrouped dataset)
if(length(grouping_variables) == 0) {
popsSII_model <- pops_prep_ab %>%
tidyr::nest(data = everything())
} else {
popsSII_model <- pops_prep_ab %>%
group_by(!!! syms(grouping_variables)) %>%
tidyr::nest()
}
# Different nest() argument needed for ungrouped dataset
if(length(grouping_variables) == 0) {
sim_CI <- pops_prep_ab %>%
tidyr::nest(data = everything()) %>%
mutate(CI_params = purrr::map(data, ~ SimulationFunc(data = .,
.data$value,
value_type,
.data$se_calc,
repetitions,
confidence,
multiplier,
.data$sqrt_a,
.data$b_sqrt_a,
rii,
transform,
reliability_stat)))
} else {
sim_CI <- pops_prep_ab %>%
group_by(!!! syms(grouping_variables)) %>%
tidyr::nest() %>%
mutate(CI_params = purrr::map(data, ~ SimulationFunc(data = .,
.data$value,
value_type,
.data$se_calc,
repetitions,
confidence,
multiplier,
.data$sqrt_a,
.data$b_sqrt_a,
rii,
transform,
reliability_stat)))
}
# Perform regression to calculate SII and extract model parameters
popsSII_model <- popsSII_model %>%
# perform linear model
mutate(model = purrr::map(data, function(df)
stats::lm(yvals ~ sqrt_a + b_sqrt_a - 1, data = df))) %>%
# extract model coefficients
mutate(model = purrr::map(.data$model, broom::tidy)) %>%
tidyr::unnest("model") %>%
# remove unecessary fields
select(!c("std.error", "statistic", "p.value")) %>%
# create columns for each parameter
tidyr::pivot_wider(names_from = "term",
values_from = "estimate")
# Format results according to whether transform = T/F
if(transform == FALSE) { #no anti-transform needed
# Extract SII and RII values
popsSII_model <- popsSII_model %>%
mutate(sii = multiplier * .data$b_sqrt_a,
rii = (.data$sqrt_a + .data$b_sqrt_a)/.data$sqrt_a,
intercept = .data$sqrt_a) %>%
# Take inverse of RII if multiplier is negative
mutate(rii = ifelse(multiplier < 0, 1 / rii, rii)) %>%
# Select fields to keep
select(all_of(grouping_variables), "sii", "rii", "intercept")
# join on dataset with SII/ RII confidence limits
# Get CIs from first round of reps
# Unnest confidence limits in a data frame for joining
sim_CI_rep1 <- sim_CI %>%
select(!c("data")) %>%
tidyr::unnest("CI_params") |>
slice_head(n = 1)
if (length(grouping_variables) > 0) {
# (grouped dataset)
popsSII_model <- popsSII_model %>%
left_join(sim_CI_rep1, by = grouping_variables)
} else {
# ungrouped dataset
popsSII_model <- popsSII_model %>%
cbind(sim_CI_rep1)
}
# Add reliability stats
if (isTRUE(reliability_stat)) {
sim_CI <- rename(sim_CI, "CI_calcs" = "CI_params")
reliabity_stats <- calc_reliability(
CI_data = sim_CI,
confidence = confidence,
rii = rii
)
if (length(grouping_variables) > 0) {
# (grouped dataset)
popsSII_model <- popsSII_model %>%
left_join(reliabity_stats, by = grouping_variables)
} else {
# ungrouped dataset
popsSII_model <- popsSII_model %>%
cbind(reliabity_stats)
}
}
} else {
popsSII_model <- popsSII_model %>%
mutate(sii = .data$b_sqrt_a,
intercept = .data$sqrt_a) %>%
# Select fields to keep
select(all_of(grouping_variables), "sii", "intercept")
#Do calculations that can be done outside of loop as they don't need the CI fields
if (value_type == 1) {#anti-log needed
popsSII_model <- popsSII_model %>%
mutate(xequals1 = .data$intercept + .data$sii,
xequalshalf = (.data$intercept + .data$xequals1) / 2,
antilogintercept = exp(.data$intercept),
antilogxequals1 = exp(.data$xequals1),
multiplier = .env$multiplier,
sii = (.data$antilogxequals1 - .data$antilogintercept) * .data$multiplier,
rii = if_else(
.data$multiplier < 0,
1 / (.data$antilogxequals1 / .data$antilogintercept),
.data$antilogxequals1 / .data$antilogintercept
),
intercept = .data$antilogintercept * abs(.data$multiplier))
} else if (value_type == 2) {#anti-logit needed
popsSII_model <- popsSII_model %>%
mutate(xequals1 = .data$intercept + .data$sii,
xequalshalf = (.data$intercept + .data$xequals1) / 2,
antilogintercept = exp(.data$intercept) / (1 + exp(.data$intercept)),
antilogxequals1 = exp(.data$xequals1) / (1 + exp(.data$xequals1)),
multiplier = .env$multiplier,
sii = (.data$antilogxequals1 - .data$antilogintercept) * .data$multiplier,
rii = if_else(
multiplier < 0,
1 / (.data$antilogxequals1 / .data$antilogintercept),
.data$antilogxequals1 / .data$antilogintercept
),
intercept = .data$antilogintercept * abs(.data$multiplier))
}
popsSII_model_CIs <- popsSII_model |>
select(all_of(grouping_variables), "xequalshalf")
popsSII_model <- popsSII_model |>
select(all_of(grouping_variables), "sii", "rii", "intercept")
# join on dataset with confidence limits
if (length(grouping_variables) > 0) {
# (grouped dataset)
popsSII_model_CIs <- popsSII_model_CIs %>%
left_join(sim_CI, by = grouping_variables)
} else {
# ungrouped dataset
popsSII_model_CIs <- popsSII_model_CIs %>%
cbind(sim_CI)
}
# Calculate SII and RII for each rep
popsSII_model_CIs <- popsSII_model_CIs |>
mutate(
CI_calcs = purrr::map2(.data$CI_params, .data$xequalshalf, function(data, xequalshalf) {
map(confidence, function(conf) {
conf_formatted <-
gsub("\\.", "_", formatC(conf * 100, format = "f", digits = 1))
selected_data <- data %>%
select(contains(conf_formatted)) |>
select(contains("sii")) |>
rename(
"sii_lower" = contains("sii_lower"),
"sii_upper" = contains("sii_upper")
)
if (value_type == 1) {
SII_calculations <- selected_data %>%
mutate(interceptlcl = xequalshalf - (.data$sii_lower / 2),
interceptucl = xequalshalf - (.data$sii_upper / 2),
xequals1lcl = xequalshalf + (.data$sii_lower / 2),
xequals1ucl = xequalshalf + (.data$sii_upper / 2),
multiplier = .env$multiplier,
sii_lower = if_else(.data$multiplier < 1, (exp(.data$xequals1ucl) - exp(.data$interceptucl)) * .data$multiplier,
(exp(.data$xequals1lcl) - exp(.data$interceptlcl)) * .data$multiplier),
sii_upper = if_else(.data$multiplier < 1, (exp(.data$xequals1lcl) - exp(.data$interceptlcl)) * .data$multiplier,
(exp(.data$xequals1ucl) - exp(.data$interceptucl)) * .data$multiplier)
)
if (isTRUE(rii)) {
SII_calculations <- SII_calculations |>
mutate(
rii_lower = if_else(
.data$multiplier < 1,
1 / (exp(.data$xequals1ucl) / exp(.data$interceptucl)
),
exp(.data$xequals1lcl) / exp(.data$interceptlcl)),
rii_upper = if_else(
.data$multiplier < 1,
1 / (exp(.data$xequals1lcl) / exp(.data$interceptlcl)),
exp(.data$xequals1ucl) / exp(.data$interceptucl)
)
)
}
} else if (value_type == 2) {
SII_calculations <- selected_data %>%
mutate(interceptlcl = xequalshalf - (.data$sii_lower / 2),
interceptucl = xequalshalf - (.data$sii_upper / 2),
xequals1lcl = xequalshalf + (.data$sii_lower / 2),
xequals1ucl = xequalshalf + (.data$sii_upper / 2),
multiplier = .env$multiplier,
sii_lower = if_else(
.data$multiplier < 0,
((exp(.data$xequals1ucl) / (1 + exp(.data$xequals1ucl))) - (exp(.data$interceptucl) / (1 + exp(.data$interceptucl)))) * .data$multiplier,
((exp(.data$xequals1lcl) / (1 + exp(.data$xequals1lcl))) - (exp(.data$interceptlcl) / (1 + exp(.data$interceptlcl)))) * .data$multiplier
),
sii_upper = if_else(
.data$multiplier < 0,
((exp(.data$xequals1lcl) / (1 + exp(.data$xequals1lcl))) - (exp(.data$interceptlcl) / (1 + exp(.data$interceptlcl)))) * .data$multiplier,
((exp(.data$xequals1ucl) / (1 + exp(.data$xequals1ucl))) - (exp(.data$interceptucl)/(1 + exp(.data$interceptucl)))) * .data$multiplier
)
)
if (isTRUE(rii)) {
SII_calculations <- SII_calculations %>%
mutate(
rii_lower = if_else(
multiplier < 0,
1 / ((exp(.data$xequals1ucl) / (1 + exp(.data$xequals1ucl))) / (exp(.data$interceptucl) / (1 + exp(.data$interceptucl)))),
((exp(.data$xequals1lcl)/(1 + exp(.data$xequals1lcl))) / (exp(.data$interceptlcl)/(1 + exp(.data$interceptlcl))))),
rii_upper = if_else(
.data$multiplier < 0,
1 / ((exp(.data$xequals1lcl) / (1 + exp(.data$xequals1lcl))) / (exp(.data$interceptlcl)/(1 + exp(.data$interceptlcl)))),
((exp(.data$xequals1ucl) / (1 + exp(.data$xequals1ucl))) / (exp(.data$interceptucl)/(1 + exp(.data$interceptucl))))
)
)
}
}
SII_calculations <- SII_calculations |>
select(any_of(contains(c("sii_lower", "sii_upper",
"rii_lower", "rii_upper")))) |>
rename_with(.fn = \(x) paste0(x, conf_formatted))
}
) |>
bind_cols()
}
)
) |>
select(all_of(grouping_variables), "CI_calcs")
# Add CIs to model
# join on dataset with SII/ RII confidence limits
# Get CIs from first round of reps
# Unnest confidence limits in a data frame for joining
CI_rep1 <- popsSII_model_CIs %>%
select(all_of(grouping_variables), "CI_calcs") %>%
tidyr::unnest("CI_calcs") |>
slice_head(n = 1)
if (length(grouping_variables) > 0) {
# (grouped dataset)
popsSII_model <- popsSII_model %>%
left_join(CI_rep1, by = grouping_variables)
} else {
# ungrouped dataset
popsSII_model <- popsSII_model %>%
cbind(CI_rep1)
}
# Add reliability stats
if (isTRUE(reliability_stat)) {
reliabity_stats <- calc_reliability(
CI_data = popsSII_model_CIs,
confidence = confidence,
rii = rii
)
if (length(grouping_variables) > 0) {
# (grouped dataset)
popsSII_model <- popsSII_model %>%
left_join(reliabity_stats, by = grouping_variables)
} else {
# ungrouped dataset
popsSII_model <- popsSII_model %>%
cbind(reliabity_stats)
}
}
}
# Part 3 - Choose and format output fields --------------------------------
# Remove reliability stat columns (if not requested by user)
if (reliability_stat == FALSE) {
popsSII_model <- popsSII_model %>%
select(-contains("mad"))
}
# Remove RII columns (if not requested by user)
if(rii == FALSE) {
popsSII_model <- popsSII_model %>%
select(!contains("rii"))
}
# Remove intercept columns (if not requested by user)
if(intercept == FALSE) {
popsSII_model <- popsSII_model %>%
select(!"intercept")
}
# Move intercept to last column of dataframe
if(intercept == TRUE) {
popsSII_model <- popsSII_model %>%
select(!"intercept", "intercept")
}
# Add metadata columns to output dataset (if requested by user)
if (type == "full") {
popsSII_model <- popsSII_model %>%
mutate(indicator_type = if_else(value_type == 0, "normal",
if_else(value_type == 1,
"rate", "proportion")),
multiplier = multiplier,
transform = if_else(transform == TRUE & value_type == 1, "log",
if_else(transform == TRUE & value_type == 2, "logit",
"none")),
CI_confidence = paste0(confidence * 100, "%",
collapse = ", "),
CI_method = paste0("simulation ", repetitions, " reps"))
}
# return output dataset
return(popsSII_model)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.