Sales comparables are recent sales which have the same (or very similar) characteristics as a target (unsold) property. They are frequently used by assessors, real estate agents, and appraisers to determine the fair market value of a home. However, finding comparable properties at scale can be difficult.
This function can be used to quickly find comparables for any number of unsold properties. It can also be used more generally to find similar properties that are near each other, regardless of whether or not they sold.
See the documentation site for example usage.
data |
A data frame containing the variables to cluster on. Should contain both numerics and factors. Numerics should be unscaled. Lat/lon should NOT be included. |
lon |
A numeric vector of longitude values, reprojected into planar coordinates specific to the target area. See here for details on reprojection using R. |
lat |
A numeric vector of latitude values, reprojected into planar coordinates specific to the target area. See here for details on reprojection using R. |
m |
The number of clusters to create when segmenting properties by their characteristics. |
k |
The number of nearest neighbors to return for each row of input data. |
l |
Hyperparameter representing the trade-off between distance and characteristics in kNN matching. Must be >= 0 and <= 1. Value equal to 1 will match on distance only, while value equal to 0 will disregard distance and match on characteristics only. Default 0.5 (equal weight). |
var_weights |
Value(s) passed to |
keep_data |
Logical for whether original data should be included in the returned object. |
... |
Arguments passed on to |
The cknn algorithm works in two stages:
1. Divide the full set of sales into m clusters according to each property's characteristics. This mimics the process of market segmentation or separating properties into different classes. This clustering is done using the k-prototypes function kproto from the clustMixType library. See the clustMixType whitepaper for more information.
2. For each property i, find the k nearest neighbors within i's cluster, minimizing the distance over planar coordinates and Euclidean distance to all cluster centers. This is accomplished with the fast kNN function from kNN.
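The two stages can be illustrated with a rough base-R sketch. This is not the package's implementation: `cknn_sketch` is a hypothetical helper, `stats::kmeans()` is swapped in as a numeric-only stand-in for k-prototypes, and the way location and characteristics are blended with `l` (scaling both blocks and weighting them by `l` and `1 - l`) is an assumption made for illustration.

```r
# Hypothetical helper: a base-R sketch of the two-stage idea,
# NOT the cknn implementation.
cknn_sketch <- function(data, lon, lat, m = 2, k = 3, l = 0.5) {
  stopifnot(l >= 0, l <= 1,
            nrow(data) == length(lon), length(lon) == length(lat))

  # Stage 1: cluster on characteristics. kmeans() is a numeric-only
  # stand-in for the k-prototypes clustering described above.
  x <- scale(as.matrix(data))
  km <- stats::kmeans(x, centers = m, nstart = 10)

  # Euclidean distance from every row to every cluster center (n x m).
  center_dist <- sapply(seq_len(m), function(j) {
    sqrt(rowSums(sweep(x, 2, km$centers[j, ])^2))
  })

  # Stage 2: blend location and characteristics, weighted by l
  # (assumed blending scheme: l = 1 uses location only, l = 0 ignores it).
  feats <- cbind(
    l * scale(cbind(lon, lat)),
    (1 - l) * scale(center_dist)
  )

  neighbors <- vector("list", nrow(data))
  for (cl in seq_len(m)) {
    idx <- which(km$cluster == cl)
    d <- as.matrix(dist(feats[idx, , drop = FALSE]))
    diag(d) <- Inf  # a property is not its own comparable
    kk <- min(k, length(idx) - 1)
    for (r in seq_along(idx)) {
      # Row indices (into the input data) of the kk nearest cluster members.
      neighbors[[idx[r]]] <- idx[order(d[r, ])[seq_len(kk)]]
    }
  }
  list(cluster = km$cluster, neighbors = neighbors)
}

# Made-up example data; coordinates are assumed to be planar already.
set.seed(1)
n <- 40
homes <- data.frame(sqft = rnorm(n, 1500, 300), age = runif(n, 0, 80))
lon <- runif(n, 0, 1000)
lat <- runif(n, 0, 1000)
fit <- cknn_sketch(homes, lon, lat, m = 2, k = 3, l = 0.5)
fit$neighbors[[1]]  # row indices of the comparables for the first property
```

Restricting the search to a property's own cluster is what keeps comparables within the same market segment, while `l` controls how much proximity matters inside that segment.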
Options for inputs to var_weights include:
- A named list with names corresponding to column names in the input data. Names not included in the list are assumed to have a value of 1. These named values are multiplied by the variance estimates created by lambdaest. Higher values will weight variables more heavily during clustering.
- A p-long unnamed vector, where p is equal to the number of columns in the input data. These weights are not multiplied by the variance estimates created by lambdaest.
- A single unnamed numeric value. This value trades off the relative importance of numeric versus categorical variables. Higher values will more heavily weight categorical variables, while a value of 0 replicates standard k-means (numerics only).
- A NULL value. This uses the default estimates produced by lambdaest. All variables are weighted equally.
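To make the four input forms concrete, here is a minimal base-R sketch of how they might be resolved into weights. `expand_weights` is a hypothetical helper, not part of the package, and `base_est` stands in for the per-column variance estimates that lambdaest would produce (the numbers below are made up for illustration).

```r
# Hypothetical helper: resolves the four documented var_weights forms.
# `base_est` is a placeholder for per-column variance estimates.
expand_weights <- function(var_weights, col_names, base_est = NULL) {
  # NULL: fall back to the default estimates.
  if (is.null(var_weights)) {
    return(base_est)
  }
  # Named list: unnamed columns default to 1, then scale the estimates.
  if (is.list(var_weights) && !is.null(names(var_weights))) {
    stopifnot(all(names(var_weights) %in% col_names))
    w <- setNames(rep(1, length(col_names)), col_names)
    w[names(var_weights)] <- unlist(var_weights)
    return(if (is.null(base_est)) w else w * base_est)
  }
  # p-long unnamed vector: used as-is, not multiplied by the estimates.
  if (length(var_weights) == length(col_names)) {
    return(setNames(as.numeric(var_weights), col_names))
  }
  # Single value: numeric-versus-categorical trade-off.
  if (length(var_weights) == 1) {
    return(as.numeric(var_weights))
  }
  stop("var_weights must be NULL, a named list, a p-long vector, or a single value")
}

cols <- c("sqft", "age", "type")
est <- c(sqft = 0.5, age = 0.25, type = 1)  # illustrative values only

w_named <- expand_weights(list(sqft = 2), cols, base_est = est)
w_full <- expand_weights(c(1, 3, 2), cols, base_est = est)
w_single <- expand_weights(0, cols, base_est = est)
```

In this sketch only the named-list form interacts with the estimates, which matches the description above: named weights rescale the defaults, while a full-length vector replaces them.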
Object of class cknn containing:
kproto |
|
knn |
List of |
knn_idx |
Lookup for translating in-cluster index positions to row indices from the input data. Used by predict method. |
lon |
Unaltered input longitude vector. Used by predict method for scaling new input data. |
lat |
Unaltered input latitude vector. Used by predict method for scaling new input data. |
var_weights |
Unaltered variable weights used to construct the cknn model. |
m |
Number of clusters created during the clustering stage. |
k |
Number of nearest neighbors returned for each row of input data. |
l |
Hyperparameter used for distance/characteristics trade-off. |
data |
Unaltered input data frame. Used by predict method for scaling new input data. Only returned if |
Input data should be thoroughly cleaned. Outliers in numeric vectors and factors with rare levels can both affect clustering performance. Outlier values should be removed. Rare factor levels should be collapsed into a single level or removed.
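As a rough illustration of this kind of cleaning, the base-R sketch below drops numeric outliers and collapses rare factor levels. Both helpers are hypothetical, the 1.5 x IQR fence and the `min_n` threshold are assumptions rather than rules prescribed by the package, and the data are made up.

```r
# Hypothetical helper: flag values outside the 1.5 * IQR fences
# (an assumed rule; pick whatever outlier definition suits your data).
is_outlier <- function(x, mult = 1.5) {
  q <- unname(stats::quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- mult * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Hypothetical helper: merge factor levels seen fewer than min_n times
# into a single catch-all level.
collapse_rare_levels <- function(f, min_n = 5, other = "Other") {
  f <- as.factor(f)
  counts <- table(f)
  rare <- names(counts)[counts < min_n]
  lv <- levels(f)
  lv[lv %in% rare] <- other
  levels(f) <- lv  # duplicated level names are merged
  f
}

# Made-up sales data: one extreme price and two rare property types.
sales <- data.frame(
  price = c(200, 210, 195, 205, 220, 5000, 198, 215),
  type = factor(c("SF", "SF", "SF", "SF", "SF", "SF", "Condo", "Barn"))
)

clean <- sales[!is_outlier(sales$price), ]
clean$type <- collapse_rare_levels(clean$type, min_n = 2)
table(clean$type)
```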