build.hf.lm.split: Hedonic Function Based on a List of Linear Models
In hepi: Functions for estimating hedonic elementary price indices


View source: R/hepi.hf2.R

This function estimates hedonic functions based on linear regression models for a given dataset. Individual sub-models are fit to subsets of the data split up according to a factor variable.

build.hf.lm.split(learndata, split.var, full.formula, min.formula,
    backtrans = I, rm.infl = FALSE, description = NULL, 
    return.row.labels = FALSE, allow.variable.selection = TRUE, 
    use.overall.hf = TRUE, split.threshold = 100)

`learndata`	A `data.frame` containing the training data set.
`split.var`	The name of the factor variable used to split the data into subsets. See Details.
`full.formula`	The formula of the full linear model. See Details.
`min.formula`	The formula of the minimal linear model, used if variable selection is wanted. See Details.
`backtrans`	A backtransformation function applied to all predictions. See Details.
`rm.infl`	A logical value indicating whether influential observations should be removed.
`description`	A character string describing the hedonic function.
`return.row.labels`	A logical value indicating whether the row labels of the cleaned training data should be returned.
`allow.variable.selection`	A logical value indicating whether variable selection should be carried out.
`use.overall.hf`	A logical value indicating whether an overall model should be fit to the whole data set.
`split.threshold`	The minimum number of observations a subset must contain for a sub-model to be fit.
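The role of `backtrans` can be illustrated with base R alone. This is a minimal sketch, not the package's code: it assumes a model fit on a log-transformed response using the built-in `mtcars` data, so that `exp` is the matching backtransformation.

```r
# Illustrative sketch only; not the code used inside hepi.
# The response is modelled on the log scale ...
fit <- lm(log(mpg) ~ wt, data = mtcars)

# ... so predictions come out on the log scale as well.
pred.log <- predict(fit, newdata = mtcars)

# The backtransformation maps them back to the original scale.
backtrans <- exp
pred <- backtrans(pred.log)

summary(pred)
```

With the default `backtrans = I`, predictions would be returned on the scale of the model's response unchanged.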

This function estimates a hedonic function based on linear regression models. In contrast to build.hf.lm, individual linear models are fit to several subsets of the data. These subsets are determined by a factor variable named split.var, which must be contained in learndata. A linear model is fit to a subset only if it contains at least split.threshold observations.
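The splitting step can be pictured with base R alone. The following is a minimal sketch, not the package's implementation: it uses the built-in `mtcars` data with `cyl` standing in for the split variable, and the helper name `fit_split_models` is hypothetical.

```r
# Illustrative sketch only; not the code used inside hepi.
# fit_split_models() is a hypothetical helper name.
fit_split_models <- function(data, split.var, formula, split.threshold) {
  # One data frame per level of the split variable
  subsets <- split(data, data[[split.var]], drop = TRUE)
  # Keep only the subsets that are large enough to fit a model
  large.enough <- vapply(subsets, nrow, integer(1)) >= split.threshold
  lapply(subsets[large.enough], function(d) lm(formula, data = d))
}

models <- fit_split_models(mtcars, "cyl", mpg ~ wt, split.threshold = 10)
names(models)  # levels of cyl that received their own sub-model
```

In `mtcars`, the 6-cylinder group has only 7 rows, so with a threshold of 10 it receives no sub-model of its own.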

If use.overall.hf is TRUE, an additional overall model is fit to the whole data set and stored. It is used to predict prices for characteristics vectors belonging to categories of split.var for which fewer than split.threshold observations are available in the learning data set.
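The fallback logic might look roughly as follows. This is a sketch under stated assumptions, not the package's code: it uses the built-in `mtcars` data with `cyl` as the split variable, and `predict_one` is a hypothetical helper.

```r
# Illustrative sketch only; not the code used inside hepi.
split.threshold <- 10
full.formula <- mpg ~ wt

# Sub-models for sufficiently large categories of the split variable (cyl)
subsets <- split(mtcars, mtcars$cyl)
subsets <- subsets[vapply(subsets, nrow, integer(1)) >= split.threshold]
sub.models <- lapply(subsets, function(d) lm(full.formula, data = d))

# Overall model fit to the whole data set, used as a fallback
overall.model <- lm(full.formula, data = mtcars)

# predict_one() is a hypothetical helper: it picks the sub-model for the
# row's category if one exists and the overall model otherwise.
predict_one <- function(row) {
  key <- as.character(row$cyl)
  model <- if (key %in% names(sub.models)) sub.models[[key]] else overall.model
  unname(predict(model, newdata = row))
}

predict_one(mtcars["Mazda RX4", ])       # cyl = 6: falls back to the overall model
predict_one(mtcars["Toyota Corolla", ])  # cyl = 4: uses the cyl = 4 sub-model
```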

See the documentation of build.hf.lm for an explanation of the other arguments of the function. Removal of influential observations and variable selection, if required, are carried out for each sub-model individually.
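To illustrate what selecting variables for each sub-model individually can mean, here is a minimal sketch. It assumes an AIC-based backward search with `stats::step` between a full and a minimal formula, which may differ from the selection procedure the package actually uses; the data (`mtcars`) and formulas are placeholders.

```r
# Illustrative sketch only; the selection procedure inside hepi may differ.
full.formula <- mpg ~ wt + hp + qsec
min.formula  <- mpg ~ 1

selected <- list()
for (level in c("4", "8")) {
  # Subset of the data belonging to one category of the split variable
  d <- mtcars[mtcars$cyl == as.numeric(level), ]
  full.fit <- lm(full.formula, data = d)
  # Backward search from the full model, never dropping below min.formula
  selected[[level]] <- step(full.fit,
                            scope = list(lower = min.formula,
                                         upper = full.formula),
                            direction = "backward", trace = 0)
}

# Each sub-model may end up with a different set of regressors
lapply(selected, formula)
```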

If return.row.labels == FALSE, the function returns a "hedonic.function" object representing the fitted regression model.

If return.row.labels == TRUE, the function returns a list with the following elements:

`hf`	The resulting `"hedonic.function"` object.
`row.labels`	A vector containing the row labels of the cleaned training data set.

Michael Beer <r-hepi@michael.beer.name>

Beer, M. (2007) Hedonic Elementary Price Indices: Axiomatic Foundation and Estimation Techniques. PhD thesis, University of Fribourg, Switzerland, http://www.michael.beer.name/phdthesis.

build.hf.lm

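library(hepi)

# Load the Boston housing data (provides boston.c). If it is not found in
# spdep, newer releases ship it in the spData package instead.
data(boston, package = "spdep")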

# One sub-model per town with at least 15 observations,
# without variable selection
hf0 <- build.hf.lm.split(
    learndata = boston.c,
    split.threshold = 15, 
    split.var = "TOWN",
    full.formula = log(MEDV) ~ CRIM + ZN + INDUS + CHAS + 
      I(NOX^2) + I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + 
      PTRATIO + B + log(LSTAT), 
    backtrans = exp, 
    rm.infl = FALSE, 
    description = NULL, 
    return.row.labels = FALSE, 
    allow.variable.selection = FALSE)

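# Check that the hedonic function can be applied to the data, then predict
is.applicable.hf(hf0, boston.c)
summary(hf0(boston.c))

# Observed versus predicted prices, with the identity line for reference
plot(boston.c$MEDV, hf0(boston.c), xlab = "Observed", ylab = "Predicted")
abline(0, 1)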

# Same specification, now with variable selection between
# min.formula and full.formula
hf1 <- build.hf.lm.split(
    learndata = boston.c, 
    split.var = "TOWN",
    split.threshold = 15, 
    full.formula = log(MEDV) ~ CRIM + ZN + INDUS + CHAS + 
      I(NOX^2) + I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + 
      PTRATIO + B + log(LSTAT), 
    min.formula = log(MEDV) ~ 1, 
    backtrans = exp, 
    rm.infl = FALSE, 
    description = NULL, 
    return.row.labels = FALSE, 
    allow.variable.selection = TRUE)
summary(hf1(boston.c))