View source: R/featurefinder.r
findFeatures | R Documentation |
Analyses model residuals, grouped by factor, to identify features that explain the target variable.
findFeatures(
OutputPath,
fcsv,
ExclusionVars,
FactorToNumericList,
treeGenerationMinBucket = 50,
treeSummaryMinBucket = 20,
treeSummaryResidualThreshold = 0,
treeSummaryResidualMagnitudeThreshold = 0,
doAllFactors = TRUE,
maxFactorLevels = 20
)
OutputPath | A string containing the location of the input csv file. Results are also stored in this location. |
fcsv | A string containing the name of a csv file |
ExclusionVars | A string containing a comma-separated list of variable names, each enclosed in double quotes |
FactorToNumericList | A list of variable names as strings |
treeGenerationMinBucket | Desired minimum number of data points per leaf (default 50) |
treeSummaryMinBucket | Minimum number of data points in each leaf for the summary (default 20) |
treeSummaryResidualThreshold | Minimum residual in the summary (default 0 for positive residuals) |
treeSummaryResidualMagnitudeThreshold | Minimum residual magnitude in the summary (default 0, i.e. no restriction) |
doAllFactors | Flag to indicate whether to analyse the levels of all factor variables (default TRUE) |
maxFactorLevels | Maximum number of levels per factor before it is converted to numeric (default 20) |
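Because ExclusionVars is a single string rather than a character vector, it can be convenient to build it from a vector of names. The following is a minimal sketch using base R only; the vector name `vars` is a placeholder introduced here, and the names in it are the ones used in the example on this page.

```r
# Variable names to pass as ExclusionVars (placeholder vector, not part of the package)
vars <- c("residual", "expected", "actual", "y")

# Wrap each name in double quotes and join the results with commas
ExclusionVars <- paste0('"', vars, '"', collapse = ",")

# Show the resulting string exactly as it would be passed to findFeatures
cat(ExclusionVars, "\n")
```

The result has no spaces after the commas. The example on this page mixes both forms, so either is presumably accepted, but that is an assumption rather than documented behaviour.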
Saves residual CART trees and associated highlighted residuals for each to the path provided.
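library(featurefinder)
# Load the example data set shipped with the package
data(mycsv)
# Prefix each SMIfactor value with "smi", turning it into a character label
data$SMIfactor <- paste("smi", as.matrix(data$SMIfactor), sep = "")
# nn is the midpoint of the series; rows after it are written out for analysis
nn <- floor(length(data$DAX) / 2)
# Can we predict the relative movement of DAX and SMI?
# y is the one-step relative change in DAX minus that of SMI, for rows 1..(nn-1)
data$y <- data$DAX * 0
data$y[1:(nn-1)] <- ((data$DAX[2:nn]) - (data$DAX[1:(nn-1)])) /
  (data$DAX[1:(nn-1)]) - (data$SMI[2:nn] - (data$SMI[1:(nn-1)])) / (data$SMI[1:(nn-1)])
# Fit a linear model and compute its residuals (actual minus expected)
thismodel <- lm(formula = y ~ ., data = data)
expected <- predict(thismodel, data)
actual <- data$y
residual <- actual - expected
data <- cbind(data, expected, actual, residual)
# Write the second half of the data to a temporary directory
OutputPath <- tempdir()
fcsv <- file.path(OutputPath, "mycsv.csv")
write.csv(data[(nn+1):(length(data$y)), ], file = fcsv, row.names = FALSE)
# Names passed as ExclusionVars: the response and the model outputs
ExclusionVars <- "\"residual\",\"expected\", \"actual\",\"y\""
FactorToNumericList <- c()
findFeatures(OutputPath, fcsv, ExclusionVars, FactorToNumericList,
  treeGenerationMinBucket = 50,
  treeSummaryMinBucket = 20)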
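To illustrate what a residual CART tree is, the sketch below fits a regression tree to the residuals of a simple linear model and summarises the leaves. It is not the implementation of findFeatures: it assumes the rpart package for the tree, uses synthetic data invented for this illustration, and only loosely mirrors the roles of treeGenerationMinBucket (50), treeSummaryMinBucket (20) and treeSummaryResidualThreshold (0).

```r
library(rpart)  # assumption: rpart is used for CART here; package internals may differ

set.seed(1)
n <- 500
# Synthetic data (hypothetical): group "c" has an extra effect when x1 > 0
d <- data.frame(
  x1 = rnorm(n),
  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
d$y <- 2 * d$x1 + ifelse(d$g == "c" & d$x1 > 0, 1.5, 0) + rnorm(n, sd = 0.3)

# Base model that ignores g, and its residuals (actual minus expected)
fit <- lm(y ~ x1, data = d)
d$residual <- d$y - predict(fit, d)

# Regression tree on the residuals, with at least 50 observations per leaf
tree <- rpart(residual ~ x1 + g, data = d,
              control = rpart.control(minbucket = 50))

# Per-leaf size and mean residual; tree$where gives the leaf of each row
leaf_n    <- as.vector(table(tree$where))
leaf_mean <- as.vector(tapply(d$residual, tree$where, mean))

# Keep leaves with at least 20 rows and a positive mean residual
keep <- leaf_n >= 20 & leaf_mean > 0
print(data.frame(n = leaf_n, mean_residual = leaf_mean)[keep, ])
```

Leaves with a large mean residual mark regions of the data where the base model is systematically off, which is where a new feature is most likely to help.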