ff_sigtest: Matrix of significance tests

View source: R/analysis.R

ff_sigtest    R Documentation

Matrix of significance tests

Description

This function returns a square, symmetrical matrix of significance tests for all pairwise combinations of rows in the data frame. It calculates the p-value from either the z test statistic or the Chi-Square test statistic. A two-sided significance test is conducted, and the null hypothesis is that there is no difference between the two parameters. The matrix length and width equal the number of rows in the data frame.

Usage

ff_sigtest(
  data_frame,
  estimate,
  se,
  test = "zscore",
  success = NULL,
  trials = NULL,
  var_names = NULL,
  pretty_print = FALSE,
  table_name = NULL
)

Arguments

data_frame

A data frame containing estimates and either standard errors for a z-score test or successes and trials for a Chi-Square test.

estimate

The column name of the estimate to conduct significance tests on.

se

The column name of the standard error of the estimate. Required if 'test' is "zscore".

test

The significance test to conduct. Either "zscore" or "chi-square". Defaults to "zscore".

success

The column name of the number of successful trials. Required for Chi-Square test.

trials

The column name of the total number of trials. Required for Chi-Square test.

var_names

A character vector of variables that can be combined to create distinct names for each row and column.

pretty_print

Boolean (TRUE / FALSE) indicating whether to return the table as a Kable HTML table that bolds statistically significant findings and applies other stylistic changes. Default is FALSE.

table_name

Character string to use as the name of the Kable table. Only used if 'pretty_print' is TRUE.

Details

The z-score formula comes from: U.S. Census Bureau, A Compass for Understanding and Using ACS Survey Data, A-18.

The z-scores are then converted to p-values using the R function for the normal cumulative distribution function: 'pnorm(z_score, lower.tail=FALSE)*2'. The Chi-Square test of proportions uses 'prop.test' and extracts the p-values from this function's results.
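
As a rough illustration of the z-score path, the sketch below computes the pairwise p-value matrix by hand. It is not the package's implementation. The helper name 'pairwise_z_pvalues' is hypothetical. The sketch assumes the usual form of the Census Bureau statistic, the difference between two estimates divided by the square root of the sum of their squared standard errors, and it takes the absolute value of the z-score so that the upper-tail 'pnorm' call gives a valid two-sided p-value regardless of which estimate is larger.

```r
# Hypothetical helper (not part of FFtools): pairwise two-sided z-test p-values.
pairwise_z_pvalues <- function(estimate, se) {
  # Difference between every pair of estimates (row i minus column j)
  est_diff <- outer(estimate, estimate, "-")
  # Standard error of each difference, assuming the estimates are independent
  se_diff <- sqrt(outer(se^2, se^2, "+"))
  # Assumption: use the absolute z-score so the upper tail is always the right one
  z_score <- abs(est_diff / se_diff)
  # Two-sided p-value from the standard normal distribution
  pnorm(z_score, lower.tail = FALSE) * 2
}

# Same estimates and standard errors as the example data frame on this page
p_matrix <- pairwise_z_pvalues(estimate = c(1, 2), se = c(0.2, 0.3))
p_matrix
```

In this sketch the diagonal is 1, because each row compared with itself has a z-score of 0. The source does not say how 'ff_sigtest' fills the diagonal.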

Value

A square, symmetrical matrix with a length and width equal to the number of rows in the data frame. Each cell contains the result of the significance test between two rows of the original data frame: the row represented by the matrix column and the row represented by the matrix row. The cell values are the p-values of a two-sided test with a null hypothesis of no difference between the observations.
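
To make the shape of the result concrete, here is a minimal sketch that builds a comparable matrix for the Chi-Square case by calling 'prop.test' on each pair of rows. It is an illustration under stated assumptions, not the function's actual code. The row labels are built by pasting 'year' and 'geo_description' together with a space, which is an assumed format, and the diagonal is left as NA because the documentation does not state what 'ff_sigtest' places there.

```r
# Same success and trial counts as the example data frame on this page
df <- data.frame(year = c(2016, 2017),
                 geo_description = c("Forsyth County, NC", "Guilford County, NC"),
                 success = c(10, 12),
                 trials = c(15, 19))

n <- nrow(df)
# Assumption: label each row by pasting the identifying columns together
labels <- paste(df$year, df$geo_description)
p_matrix <- matrix(NA_real_, nrow = n, ncol = n,
                   dimnames = list(labels, labels))

for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) {
      # Two-sample test of equal proportions for rows i and j
      result <- prop.test(x = df$success[c(i, j)], n = df$trials[c(i, j)])
      p_matrix[i, j] <- result$p.value
    }
  }
}

p_matrix
```

Because the test is two-sided, comparing row i with row j gives the same p-value as comparing row j with row i, which is why the matrix is symmetrical.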

Examples

df <- data.frame(year = c(2016, 2017),
                 geo_description = c('Forsyth County, NC', 'Guilford County, NC'),
                 estimate = c(1,2),
                 se = c(.2, .3),
                 success = c(10, 12),
                 trials = c(15, 19))

# Z score test
ff_sigtest(data_frame = df, estimate = 'estimate', se = 'se',
           test = 'zscore', var_names = c('year', 'geo_description'))

# Chi-Square test
ff_sigtest(data_frame = df, estimate = 'estimate', success = 'success', trials = 'trials',
           test = 'chi-square', var_names = c('year', 'geo_description'))

forsythfuture/FFtools documentation built on April 5, 2022, 10:02 p.m.