match_mixed: Test and Control Selector for Groups/Individuals, with Mixed...

match_mixed    R Documentation

Test and Control Selector for Groups/Individuals, with Mixed Input Variables/Metrics.

Description

Randomly selects test groups/individuals and creates matching control groups/individuals, using Euclidean distance on scaled numeric variables or Gower's method for data sets that mix numeric and categorical variables. The Gower distances come from the cluster::daisy() function, which handles numeric-only input as well as mixed numeric and categorical input.

Usage

match_mixed(df, n = 10, test_list = NULL)

Arguments

df

Data frame of numeric or mixed inputs. The first column must contain the group/individual names, with one row per group/individual.

n

Size of the test group and of the matching control group. Defaults to 10. Ignored if a data frame is supplied to the "test_list" parameter.

test_list

Data frame with one column named "TEST" that lists the members of the current test. Defaults to NULL.

Details

The data frame must contain the group/individual labels in the first column, and the remaining variables must be supplied in levels, that is, unscaled.
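Because the inputs are supplied in levels, any scaling has to happen on the way to the distance matrix. The following is a minimal sketch of the two distance options named in the Description, not the package's internal code: Euclidean distance on scaled numeric columns via base R, and Gower distance via cluster::daisy(). The object names (num_df, mix_df, d_euclid, d_gower) and the "division" column name are assumptions made for illustration.

```r
# Sketch only: illustrates the two distance options, not TestContR internals.
library(cluster)  # provides daisy()

# Numeric-only input: labels in column 1, unscaled metrics after it.
num_df <- data.frame(state = rownames(USArrests), USArrests,
                     row.names = NULL)

# Euclidean distance on centered and scaled numeric variables.
scaled   <- scale(num_df[, -1])
d_euclid <- as.matrix(dist(scaled))
dimnames(d_euclid) <- list(num_df$state, num_df$state)

# Mixed input: add a categorical column (a factor).
mix_df <- cbind(num_df, division = state.division)

# Gower distance handles numeric and categorical columns together.
d_gower <- as.matrix(daisy(mix_df[, -1], metric = "gower"))
dimnames(d_gower) <- list(mix_df$state, mix_df$state)

# Closest state to Ohio under each distance (excluding Ohio itself).
others <- setdiff(num_df$state, "Ohio")
others[which.min(d_euclid["Ohio", others])]
others[which.min(d_gower["Ohio", others])]
```

Both matrices are indexed by the labels from the first column, so a row lookup gives the distances from one group/individual to every other.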

In the case where duplicates arise in the Control, the function iterates through the test control list until there are no duplicates in the Control. In each iteration, it re-ranks the remaining possible control groups/individuals and matches to the test on the lowest distance.
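The iteration described above can be sketched as follows. This is an illustrative reading of the paragraph, not the package's source: the helper name match_controls, the CONTROL column name, and the rule that the closest test unit keeps a contested control are all assumptions.

```r
# Hypothetical helper: 1-to-1 matching with duplicate resolution.
# d    : square distance matrix with unit labels as row and column names
# test : character vector of unique test unit labels
match_controls <- function(d, test) {
  pool <- setdiff(rownames(d), test)          # candidate controls
  stopifnot(length(pool) >= length(test))     # enough controls to go around

  # First pass: nearest candidate for each test unit (duplicates allowed).
  control <- vapply(test, function(t) pool[which.min(d[t, pool])],
                    character(1))

  while (anyDuplicated(control) > 0) {
    losers <- character(0)
    for (dup in unique(control[duplicated(control)])) {
      rivals <- names(control)[control == dup]
      # Assumption: the closest test unit keeps the contested control.
      keep   <- rivals[which.min(d[rivals, dup])]
      losers <- c(losers, setdiff(rivals, keep))
    }
    # Re-rank the controls nobody holds and re-match the displaced units.
    free <- setdiff(pool, control)
    control[losers] <- vapply(losers,
                              function(t) free[which.min(d[t, free])],
                              character(1))
  }
  data.frame(TEST = test, CONTROL = unname(control[test]))
}

# Usage on scaled USArrests with a random test group of 10 states.
d <- as.matrix(dist(scale(USArrests)))
set.seed(1)
test <- sample(rownames(d), 10)
res  <- match_controls(d, test)
```

Each pass strictly increases the number of distinct controls in use, so the loop stops once every test unit holds its own control.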

You can supply a data frame of pre-selected test groups/individuals to the parameter test_list and the function will provide you with a list of control groups/individuals.
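A short sketch of that workflow, assuming three arbitrarily chosen states as the pre-selected test group. The "division" column name and the object names are illustrative, and the call is guarded so the snippet still runs where TestContR is not installed.

```r
# Input data: labels in the first column, unscaled metrics after it.
df <- data.frame(state = rownames(USArrests), USArrests,
                 division = state.division, row.names = NULL)

# Pre-selected test units; the single column must be named "TEST".
test_list <- data.frame(TEST = c("Ohio", "Texas", "Maine"))

# "n" is ignored when "test_list" is supplied.
if (requireNamespace("TestContR", quietly = TRUE)) {
  controls <- TestContR::match_mixed(df, test_list = test_list)
  print(controls)
}
```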

Value

If the "n" parameter is used, the function returns a data frame listing randomly selected test groups/individuals from the supplied df, each with a matching control group/individual (a 1-to-1 match). If a data frame is supplied to the "test_list" parameter, a 1-to-1 matching control group/individual is created for each group/individual in its "TEST" column.

Examples

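library(dplyr)
library(magrittr)

# Build the input: state labels in the first column, followed by the
# unscaled USArrests metrics and the categorical census division.
df <- datasets::USArrests %>%
  dplyr::mutate(state = base::row.names(datasets::USArrests)) %>%
  base::cbind(datasets::state.division) %>%
  dplyr::select(state, dplyr::everything())

# Randomly pick 15 test states and match one control state to each.
TEST_CONTROL_LIST <- TestContR::match_mixed(df, n = 15)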

Fredo-XVII/TestContR documentation built on Nov. 5, 2022, 5:54 p.m.