topn_mixed: Top N Control Selector for 1 or more Test...

topn_mixed    R Documentation

Top N Control Selector for 1 or more Test Group(s)/Individual(s), with Mixed Input Variables/Metrics.

Description

Selects the N nearest control groups/individuals for each of 1 or more test group(s)/individual(s).

Usage

topn_mixed(df, topN = 5, test_list = NULL)

Arguments

df

Data frame of numeric or mixed (numeric and categorical) inputs. The first column must contain the group/individual names, with 1 row per group/individual.

topN

Number of nearest control groups/individuals (the top "N") to return for each test group/individual. Defaults to 5.

test_list

Data frame with one column named "TEST" and one row per test group/individual label. Defaults to NULL. Be aware that if test_list is left NULL, the function uses every unique label in the first column of df as a test label, so the result is a data frame with (number of labels) x topN rows.

Details

Given the complete list of groups/individuals in df and a data frame of 1 or more TEST group(s)/individual(s) in test_list, the function returns the N nearest control groups/individuals for each TEST entry. If more than 1 TEST group/individual is provided, the same control is likely to be selected for several of them. The function does not remove these duplicates, so the result is a data frame with (number of TEST labels) x N rows. Inputs can be all numeric or a mix of numeric and categorical variables; dissimilarities are computed with Gower's method via the cluster::daisy() function.
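The selection described above can be illustrated with a minimal sketch. It is not the package's implementation: the helper nearest_by_gower() and the output column names CONTROL and DIST are hypothetical and introduced only for illustration. The sketch assumes that "nearest" means smallest Gower dissimilarity as returned by cluster::daisy(), and that the test label itself is excluded from its own controls.

```r
library(cluster)

# Hypothetical helper (not part of TestContR): rank all other rows by
# Gower dissimilarity to one test label and keep the topN closest.
nearest_by_gower <- function(df, test_label, topN = 5) {
  labels <- as.character(df[[1]])          # first column holds the names
  features <- df[-1]                       # remaining columns are the metrics
  # daisy() expects categorical variables as factors, not character vectors
  features[] <- lapply(features, function(x) if (is.character(x)) factor(x) else x)
  d <- as.matrix(cluster::daisy(features, metric = "gower"))
  dimnames(d) <- list(labels, labels)
  dist_to_test <- d[test_label, ]
  dist_to_test <- dist_to_test[names(dist_to_test) != test_label]  # drop self
  top <- utils::head(sort(dist_to_test), topN)
  data.frame(TEST = test_label, CONTROL = names(top), DIST = unname(top),
             stringsAsFactors = FALSE)
}

# Mixed input: four numeric crime metrics plus one categorical division
df <- data.frame(state = rownames(datasets::USArrests),
                 datasets::USArrests,
                 division = datasets::state.division)

# Two test labels: results are stacked without de-duplicating controls,
# giving (number of labels) x topN rows
tests <- c("Colorado", "Utah")
controls <- do.call(rbind, lapply(tests, nearest_by_gower, df = df, topN = 5))
```

Because each test label is ranked independently, a state that is close to both Colorado and Utah can appear twice in controls, which matches the Labels x N row count described above.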

Examples

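library(dplyr)
library(magrittr)
# Build the input: state names in the first column, then numeric metrics
# (USArrests) and one categorical variable (state.division)
df <- datasets::USArrests %>% dplyr::mutate(state = base::row.names(datasets::USArrests)) %>%
  base::cbind(datasets::state.division) %>%
  dplyr::select(state, dplyr::everything())

# One-column data frame named "TEST" with one row per test label
test_list <- dplyr::tribble(~"TEST","Colorado")
# Select the 5 nearest control states for Colorado
TOPN_CONTROL_LIST <- TestContR::topn_mixed(df, topN = 5, test_list = test_list)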

Fredo-XVII/TestContR documentation built on Nov. 5, 2022, 5:54 p.m.