getStats: Get table of indicator statistics for any data set


View source: R/coin_preanalyse.R

Description

Takes a COIN or data frame and returns a table of statistics for each column, including the maximum, minimum, median, mean, standard deviation, and kurtosis, among others. Flags indicators with possible outliers and checks for collinearity with other indicators and with denominators (if any). Also reports the number of unique values and the percentage of zeros for each indicator. The statistics table is returned together with correlation matrices and a table of outliers, all in a single list.
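To make the kind of table described above concrete, here is a minimal base-R sketch that builds a per-column statistics table for a small data frame. It is not the getStats implementation: the toy data, the helper names (skew, kurt, col_stats), the column names of the result, and the moment-based skewness and kurtosis estimators are all assumptions made for illustration.

```r
# Hypothetical toy data: three indicators, one with zeros and a missing value
df <- data.frame(
  ind1 = c(1, 2, 3, 4, 100),
  ind2 = c(0, 0, 5, NA, 7),
  ind3 = c(2.1, 2.9, 4.2, 5.1, 5.8)
)

# Moment-based skewness and (non-excess) kurtosis; other estimators exist
# and may give slightly different values (assumption, not the package's choice)
skew <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
kurt <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

# Statistics for a single column, returned as a named numeric vector
col_stats <- function(x) {
  c(
    min        = min(x, na.rm = TRUE),
    max        = max(x, na.rm = TRUE),
    mean       = mean(x, na.rm = TRUE),
    median     = median(x, na.rm = TRUE),
    sd         = sd(x, na.rm = TRUE),
    skew       = skew(x),
    kurt       = kurt(x),
    n_unique   = length(unique(x[!is.na(x)])),
    pc_zero    = 100 * mean(x == 0, na.rm = TRUE),
    pc_missing = 100 * mean(is.na(x))
  )
}

# One row per indicator, one column per statistic
stat_table <- as.data.frame(t(sapply(df, col_stats)))
print(round(stat_table, 2))
```

Each row of stat_table describes one indicator, so a column such as pc_zero or n_unique can be scanned quickly for indicators that deserve a closer look.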

Usage

getStats(
  COIN,
  icodes = NULL,
  dset = "Raw",
  out2 = "COIN",
  cortype = "pearson",
  t_skew = 2,
  t_kurt = 3.5,
  t_colin = 0.9,
  t_denom = 0.7,
  t_missing = 65,
  IQR_coef = 1.5
)

Arguments

COIN

A COIN object or data frame of indicator data

icodes

A character vector of indicator names to analyse. Defaults to all indicators.

dset

The data set to analyse.

out2

Where to output the results. If "COIN" (default), the results are appended to the COIN; if "list", they are returned as a separate list.

cortype

The type of correlation to calculate, either "pearson", "spearman", or "kendall". See stats::cor.

t_skew

Skewness threshold.

t_kurt

Kurtosis threshold.

t_colin

Collinearity threshold (absolute value of correlation).

t_denom

Threshold for flagging high correlation with a denominator.

t_missing

Missing data threshold, in percent.

IQR_coef

Interquartile range coefficient, used for identifying outliers.
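The collinearity and outlier thresholds above can be illustrated with a short base-R sketch. This page does not state how getStats applies them internally, so the rules below are assumptions: collinear pairs are taken to be those whose absolute correlation exceeds t_colin, and outliers are taken to be points outside the usual Tukey fences scaled by IQR_coef. The toy data and the names colin_pairs, is_outlier, and outlier_table are hypothetical.

```r
# Hypothetical toy data; ind2 is almost a linear function of ind1
df <- data.frame(
  ind1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50),
  ind2 = c(2.1, 3.9, 6.2, 8.0, 9.9, 12.1, 14.0, 16.2, 17.9, 100),
  ind3 = c(5, 3, 8, 1, 9, 2, 7, 4, 6, 5)
)

# Thresholds, named after the getStats() arguments they illustrate
t_colin  <- 0.9
IQR_coef <- 1.5

# Collinearity (assumed rule): pairs with absolute correlation above t_colin.
# upper.tri() keeps each pair once and drops the diagonal.
cmat <- cor(df, method = "pearson", use = "pairwise.complete.obs")
high <- which(abs(cmat) > t_colin & upper.tri(cmat), arr.ind = TRUE)
colin_pairs <- data.frame(
  ind_a = rownames(cmat)[high[, "row"]],
  ind_b = colnames(cmat)[high[, "col"]],
  cor   = cmat[high]
)
print(colin_pairs)

# Outliers (assumed rule): values below Q1 - k * IQR or above Q3 + k * IQR
is_outlier <- function(x, k = IQR_coef) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  x < q[[1]] - k * iqr | x > q[[2]] + k * iqr
}

# Logical matrix: one row per observation, one column per indicator
outlier_table <- sapply(df, is_outlier)
print(colSums(outlier_table))
```

Raising IQR_coef widens the fences and flags fewer points; lowering t_colin reports more indicator pairs as collinear.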

Value

If out2 = "COIN" (default), results are appended to the COIN in .$Analysis; if out2 = "list", they are returned as a separate list. In both cases, the result is a list containing the table of indicator statistics, the correlation matrices, and the table of outliers.

Examples

# build ASEM COIN
ASEM <- assemble(IndData = ASEMIndData, IndMeta = ASEMIndMeta, AggMeta = ASEMAggMeta)
# get list of stats from raw data set
stat_list <- getStats(ASEM, dset = "Raw", out2 = "list")

COINr documentation built on Nov. 30, 2021, 9:06 a.m.