hellwig: Hellwig's method for choosing a subset of independent variables

View source: R/hellwig.R

Description

Hellwig's method selects a subset of independent variables for a linear regression model based on their correlations with the dependent variable and with each other. The goal is to select a subset of variables that are only weakly correlated with each other but highly correlated with the dependent variable.

Usage

hellwig(y, x, method = "pearson")

Arguments

y

numeric, dependent variable

x

numeric matrix, independent variables

method

character, type of correlation coefficient to use; passed to cor
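
Because method is passed to cor, its accepted values are presumably those of stats::cor: "pearson", "kendall" and "spearman". The sketch below calls cor directly on simulated data to show the two kinds of correlation the method relies on; it does not call hellwig itself, and the variable names are illustrative only.

```r
set.seed(1234)
x <- matrix(rnorm(1000), 250, 4)
y <- rnorm(250)

# Correlations of each column of x with y (a 4 x 1 matrix)
r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")

# Correlations among the columns of x (a 4 x 4 matrix)
r_between <- cor(x, method = "spearman")
```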

Details

Given m independent variables, Hellwig's method evaluates all 2^m - 1 non-empty combinations of them using the following steps:

  1. The individual capacity of the j-th independent variable in combination k is given by:

    h_kj = r_0j^2 / sum_{i \in I} r_ij

    where r_0j is the correlation of the j-th independent variable with the dependent variable, r_ij is the correlation between the i-th and j-th independent variables, and I is the set of independent variables in combination k.

  2. Integral capacity of information for every combination k is equal to:

    H_k = sum_j h_kj

The subset with the highest value of H_k should be selected.
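
The two steps can be sketched in a few lines of R. This is a minimal illustration, not the package's source: hellwig_capacity is a hypothetical helper name, and the sketch assumes absolute values of r_ij in the denominator, as in the usual statement of Hellwig's method, which the formula above does not show.

```r
# Hypothetical helper (not part of mbtools): integral capacity H_k of the
# subset of columns of `x` given by the index vector `idx`.
hellwig_capacity <- function(y, x, idx, method = "pearson") {
  xs <- x[, idx, drop = FALSE]
  r0 <- as.vector(stats::cor(xs, y, method = method))  # r_0j
  r  <- stats::cor(xs, method = method)                # r_ij, diagonal is 1
  # Assumption: absolute correlations are summed in the denominator
  h <- r0^2 / colSums(abs(r))                          # h_kj
  sum(h)                                               # H_k
}

set.seed(1234)
x <- matrix(rnorm(1000), 250, 4)
y <- rnorm(250)

# Enumerate all 2^m - 1 non-empty subsets of the m columns
m <- ncol(x)
subsets <- unlist(
  lapply(seq_len(m), function(size) utils::combn(m, size, simplify = FALSE)),
  recursive = FALSE
)

# Integral capacity of every subset, and the subset that maximizes it
H <- vapply(subsets, function(idx) hellwig_capacity(y, x, idx), numeric(1))
best <- subsets[[which.max(H)]]
```

For a subset with a single variable the denominator is 1, so its capacity reduces to the squared correlation with y.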

Value

Data frame with two columns: k, the combination of independent variables in the form x-y-z, where x, y, z, ... are the indices of columns in x; and h, the integral capacity H_k of that subset.
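
A result of this shape can be reduced to the selected subset as sketched below. The data frame res is a hand-made stand-in with made-up values, used only so the snippet runs without the package; in practice it would be the value returned by hellwig.

```r
# Hypothetical result in the documented shape (values made up for illustration)
res <- data.frame(
  k = c("1", "2", "1-2"),
  h = c(0.10, 0.20, 0.25),
  stringsAsFactors = FALSE
)

# Row with the highest integral capacity
best <- res[which.max(res$h), ]

# Recover the column indices of x from the "x-y-z" label
best_cols <- as.integer(strsplit(best$k, "-", fixed = TRUE)[[1]])
```

Here best_cols holds the column indices of the selected subset, so x[, best_cols, drop = FALSE] would give the chosen variables.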

References

TODO Add references

Examples

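# Simulate 250 observations of 4 independent variables and a response
set.seed(1234)
x <- matrix(rnorm(1000), 250, 4)
y <- rnorm(250)
# Capacities of the combinations of columns of x
hellwig(y, x)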

mbojan/mbtools documentation built on Nov. 9, 2017, 3:21 p.m.