b_maxvarK: Maximum Variance of Gaussian Kernel Matrix
In chadhazlett/KBAL: Kernel Balancing

b_maxvarK

R Documentation

Maximum Variance of Gaussian Kernel Matrix

Description

Searches for the value of `b`, the denominator of the Gaussian kernel, that maximizes the variance of the kernel matrix `K`.

Usage

b_maxvarK(data, useasbases, cat_data = TRUE, maxsearch_b = 2000)

Arguments

`data`	a matrix of data in which rows are units and columns are covariates. When all covariates are categorical, this matrix should be one-hot encoded (see `one_hot`) and the `cat_data` argument set to `TRUE`.
`useasbases`	binary vector specifying which observations are used to form the bases (columns) of the kernel matrix. Suggested default: if the number of observations is under 4000, use all observations; if it is over 4000, use only the sampled (control) units.
`cat_data`	logical indicating whether the kernel contains only categorical data. Default is `TRUE`.
`maxsearch_b`	the maximum value of `b`, the denominator of the Gaussian, searched during maximization. Default is `2000`.
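
The suggested default for `useasbases` can be written out as a short rule. The sketch below is illustrative only: `default_useasbases` and its `threshold` argument are hypothetical names, not part of the package, and the handling of exactly 4000 observations is an assumption, since the text above only covers "under" and "over" 4000.

```r
# Hypothetical helper (not a package function): build a useasbases vector
# following the suggested default described in the Arguments section.
default_useasbases <- function(sampled, threshold = 4000) {
  # sampled: binary vector, 1 for sampled (control) units, 0 otherwise
  n <- length(sampled)
  if (n <= threshold) {
    rep(1, n)                  # small sample: every unit is a basis
  } else {
    as.numeric(sampled == 1)   # large sample: sampled units only
  }
}

# Four units, well under the threshold: all units become bases
default_useasbases(c(1, 0, 1, 1))
# Lower the threshold to mimic a large sample: only sampled units are bases
default_useasbases(c(1, 0, 1, 1), threshold = 3)
```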

Value

`b_maxvar`	numeric value of `b`, the denominator of the Gaussian, that produces the maximum variance of the kernel matrix `K`
`var_K`	numeric maximum variance of the kernel matrix `K`, obtained with `b` set to `b_maxvar`
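
To make the two returned quantities concrete, here is a minimal base-R sketch of a variance-maximizing search over `b`. It is not the package's implementation. It assumes the kernel entries take the form `exp(-||x_i - x_j||^2 / b)`, that the variance is taken over all entries of `K`, and that a one-dimensional search with `optimize` is adequate. The helper `kernel_variance` and the toy data are hypothetical.

```r
# Hypothetical helper, not part of the package: variance of the entries of a
# Gaussian kernel matrix for a given b, assuming
# K[i, j] = exp(-||x_i - x_j||^2 / b).
kernel_variance <- function(b, X, bases) {
  # Rows of X selected as bases form the columns of K
  XB <- X[bases == 1, , drop = FALSE]
  # Squared Euclidean distances between every unit and every basis unit
  sq_dist <- outer(rowSums(X^2), rowSums(XB^2), "+") - 2 * tcrossprod(X, XB)
  sq_dist <- pmax(sq_dist, 0)  # guard against tiny negative rounding errors
  K <- exp(-sq_dist / b)
  var(as.vector(K))
}

set.seed(1)
# Toy binary data: 60 units, 6 covariates (illustrative only)
X <- matrix(rbinom(360, 1, 0.5), nrow = 60)
bases <- rep(1, nrow(X))

# Search b over (0, 2000], mirroring the role of maxsearch_b
res <- optimize(kernel_variance, interval = c(1e-6, 2000),
                X = X, bases = bases, maximum = TRUE)
res$maximum    # plays the role of b_maxvar
res$objective  # plays the role of var_K
```

Because `optimize` performs a local one-dimensional search, the value it reports is the best `b` found within the interval, which matches the "found with `b`" wording used for `var_K`.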

Examples


# lalonde with only categorical data
library(kbal)  # assumed package name; provides lalonde, one_hot, and b_maxvarK
set.seed(123)
data("lalonde")
# Select a random subset of 500 rows
lalonde_sample <- sample(1:nrow(lalonde), 500, replace = FALSE)
lalonde <- lalonde[lalonde_sample, ]

cat_vars <- c("black", "hisp", "married", "nodegr", "u74", "u75")
# Convert to a one-hot encoded data matrix:
onehot_lalonde <- one_hot(lalonde[, cat_vars])
colnames(onehot_lalonde)
# Use the units with nsw == 0 as bases
best_b <- b_maxvarK(data = onehot_lalonde,
                    useasbases = 1 - lalonde$nsw)

chadhazlett/KBAL documentation built on Sept. 23, 2024, 11:48 a.m.