contract: contracts explanatory variables of high absolute correlation


View source: R/contselec-package8.R

Description

This function converts data0 into data1 by 1) grouping highly correlated variables and 2) replacing each group's variables with its PCA scores. The column specified by "target" is treated as the explained variable, named "y", and is left untouched.

Usage

contract(data0, edge_explain = -1, edge_cor, np = -1, target = 1)

Arguments

data0

data.frame : explained and explanatory variables.

edge_explain

real number : threshold for the variance ratio to be explained. The number of principal components "np" for representing the contracted group is determined so that [variance by the np principal components]/[total variance] exceeds "edge_explain".
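The rule for choosing "np" can be sketched in base R as follows. This is only an illustration of the cumulative variance ratio criterion described above, not the package's own code: the helper name choose_np is hypothetical, "exceeds" is read as strictly greater, and centering and scaling the variables before PCA is an assumption.

```r
# Hypothetical helper (not part of contselec): smallest number of principal
# components whose cumulative variance ratio exceeds edge_explain.
choose_np <- function(x, edge_explain) {
  # Assumption: variables are centered and scaled before PCA.
  pca <- prcomp(x, center = TRUE, scale. = TRUE)
  # Cumulative share of total variance carried by the first k components.
  ratio <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  np <- which(ratio > edge_explain)[1]
  # If no prefix strictly exceeds the threshold, fall back to all components.
  if (is.na(np)) np <- length(ratio)
  list(np = np, ratio = ratio)
}

# Toy group of three variables that share one latent factor.
set.seed(1)
z <- rnorm(200)
x <- data.frame(a = z + rnorm(200, sd = 0.1),
                b = z + rnorm(200, sd = 0.1),
                c = z + rnorm(200, sd = 0.1))
res <- choose_np(x, edge_explain = 0.6)
res$np
```

Because the three toy variables are almost collinear, a single component already carries most of the variance, so a small "np" is selected.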

edge_cor

real number : threshold for clustering; variables whose correlation is no less than this threshold are clustered together.
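One way to picture this thresholding is to link two variables whenever their absolute correlation is at least "edge_cor" and to take each connected set of variables as a group. The sketch below does this in base R. It is an assumption about the general idea, not the package's implementation: the helper name group_by_correlation is hypothetical, and the actual clustering rule used by contract() may differ.

```r
# Hypothetical helper (not part of contselec): assign a group id to each
# column, linking columns whose absolute correlation is >= edge_cor.
group_by_correlation <- function(x, edge_cor) {
  adj <- abs(cor(x)) >= edge_cor   # adjacency matrix of the threshold graph
  p <- ncol(x)
  gid <- integer(p)                # 0 means "not yet assigned"
  next_id <- 0L
  for (i in seq_len(p)) {
    if (gid[i] != 0L) next
    next_id <- next_id + 1L
    queue <- i
    # Breadth-first search over variables linked to variable i.
    while (length(queue) > 0) {
      j <- queue[1]
      queue <- queue[-1]
      if (gid[j] != 0L) next
      gid[j] <- next_id
      queue <- c(queue, which(adj[j, ] & gid == 0L))
    }
  }
  setNames(gid, colnames(x))
}

# Toy data: a and b share a latent factor, c is independent noise.
set.seed(1)
z <- rnorm(200)
x <- data.frame(a = z + rnorm(200, sd = 0.1),
                b = z + rnorm(200, sd = 0.1),
                c = rnorm(200))
gid <- group_by_correlation(x, edge_cor = 0.9)
gid
```

Variables that end up with the same group id would be contracted together; a variable alone in its group is left as a singleton.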

np

integer : number of principal components to represent the contracted group, if assigned.

target

integer or character : the column specified by "target" is treated as the explained variable "y", which is left untouched.

Value

list(data1, group, data0), where "data1" (data.frame) is the contracted data, "group" (list(gid, gid_data1, ngid, vname1_orig, vname0_orig)) contains information about the contraction of "data0" into "data1", and "data0" is the data before contraction.

Author(s)

Hiroshi C. Ito

Examples

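library(contselec)
data(Cars93, package = "MASS")

# Keep complete rows and numeric columns only
data <- Cars93[complete.cases(Cars93), ]
data <- data[, sapply(data, is.numeric)]

# Contract highly correlated explanatory variables; "Horsepower" is the explained variable
con <- contract(data, edge_cor = 0.9, edge_explain = 0.6, target = "Horsepower")

con$target
# Column names before contraction
print(data.frame(uncontracted.name = colnames(con$data0)))
# Column names after contraction, with the original members of each group
print(data.frame(contracted.name = colnames(con$data1), members = con$group$vname1_orig))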

yorickuser/contselec documentation built on July 25, 2021, 8:14 a.m.