This vignette describes the steps for using the taxr package to conduct a ratio study analysis, using a data set of 2018 sales from King County, Washington. This package and vignette make use of the tidyverse.
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(taxr)
library(tidyverse)
Ratio studies are a standard method of analyzing tax assessment accuracy. The main unit of analysis is the ratio of assessed value to sale price (ASR) for properties that sold in a relevant time period. For more on ratio studies, see https://www.iaao.org/media/standards/Standard_on_Ratio_Studies.pdf.
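As a minimal sketch of the basic quantity, the ratio can be computed directly in base R. The values and the names `assessed`, `saleprice`, `asr`, and `median_asr` below are made up for illustration and are not taken from the example data.

```r
# Hypothetical assessed values and sale prices for four sold properties
assessed  <- c(180000, 250000, 310000, 420000)
saleprice <- c(200000, 250000, 400000, 400000)

# Ratio of assessed value to sale price for each sale
asr <- assessed / saleprice

# The median ratio is a common summary of the overall assessment level
median_asr <- median(asr)
median_asr
```

A ratio below 1 means the property was assessed below its sale price, and a ratio above 1 means it was assessed above it.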
data("ex_tax_data")
dim(ex_tax_data)
The data is a random sample of 1,000 sales from King County, Washington. The raw files are publicly provided by the county assessor. Sales and assessment data files were joined by parcel identification number (PIN). The raw data can be found at the following URL:
https://info.kingcounty.gov/assessor/DataDownload/default.aspx
knitr::kable(head(ex_tax_data))
The taxr::standardize function ensures consistent column names and types. Its arguments correspond to columns that are expected in real estate data; for each one, supply a character value giving the name of the column in your data that should be renamed. The logical argument calc_asr calculates $ASR = \frac{assessed}{saleprice}$ if no ASR column is provided.
df <- ex_tax_data %>%
  standardize(
    uid = 'PIN',
    assessed = 'ApprVal',
    saleprice = 'SalePrice',
    saledate = 'DocumentDate',
    living_area = 'SqFtTotLiving',
    yearbuilt = 'YrBuilt'
  )
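For intuition, the renaming and ratio calculation can be approximated by hand with dplyr. This is only a sketch and not the package's implementation: it assumes the standardized columns take the argument names (uid, assessed, saleprice) and that the computed ratio is stored in a column named asr. The data frame `raw` and the result `manual` are made-up names with made-up values.

```r
library(dplyr)

# Tiny made-up data frame that reuses three of the raw column names
raw <- data.frame(
  PIN       = c("A1", "B2"),
  ApprVal   = c(300000, 450000),
  SalePrice = c(320000, 400000)
)

# Rename to the assumed standardized names, then compute the ratio
manual <- raw %>%
  rename(uid = PIN, assessed = ApprVal, saleprice = SalePrice) %>%
  mutate(asr = assessed / saleprice)

manual
```

Any type conversion that standardize performs, for example on the sale date, is not reproduced here.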
This package does not have specific functionality for all of these fields (for example, yearbuilt). However, a comprehensive ratio study would likely include some analysis of them. It is therefore good practice to work with a standardized set of column names, so that user-written code can be reused across projects.
To identify outliers, the isOutlier function supports two methods: an interquartile range (IQR) method and a percentile method. By default, the IQR method identifies outliers that are more than 3 times the IQR below the first quartile or above the third quartile. This threshold can be changed with the val argument.
sum(isOutlier(df$asr, method = 'iqr', val = 3))
sum(isOutlier(df$asr, method = 'iqr', val = 1.5))
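The IQR rule described above can be sketched in base R as follows. The helper `iqr_outlier` is hypothetical and not part of taxr, and it relies on the default quantile definition of stats::quantile, which may differ from what isOutlier uses internally. The `ratios` vector is made up.

```r
# Hypothetical helper, not part of taxr: flag values that lie more than
# `val` IQRs below the first quartile or above the third quartile
iqr_outlier <- function(x, val = 3) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  x < q[1] - val * spread | x > q[2] + val * spread
}

# Made-up ratios with one extreme value
ratios <- c(0.80, 0.90, 0.95, 1.00, 1.05, 1.10, 3.00)
flags <- iqr_outlier(ratios)
sum(flags)
```

A smaller val narrows the accepted range, so it can only flag the same number of values or more.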
The percentile method keeps an inner percentage of values, by default the inner 99.5%, and identifies values outside it as outliers. This can be changed with the inner_perc argument.
sum(isOutlier(df$asr, method = 'ptile', inner_perc = .995))
sum(isOutlier(df$asr, method = 'ptile', inner_perc = .95))
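A percentile rule of this kind can be sketched in base R as follows. The helper `ptile_outlier` is hypothetical and not part of taxr. It assumes the excluded share is split evenly between the lower and upper tails, which may not match how isOutlier computes its bounds. The simulated `ratios` are made up.

```r
# Hypothetical helper, not part of taxr: flag values outside the central
# `inner_perc` share of the distribution, with equal-sized tails assumed
ptile_outlier <- function(x, inner_perc = 0.995) {
  tail_prob <- (1 - inner_perc) / 2
  bounds <- quantile(x, c(tail_prob, 1 - tail_prob),
                     na.rm = TRUE, names = FALSE)
  x < bounds[1] | x > bounds[2]
}

# Simulated ratios centered near 1
set.seed(1)
ratios <- rlnorm(1000, meanlog = 0, sdlog = 0.2)
n_flagged <- sum(ptile_outlier(ratios, inner_perc = 0.95))
n_flagged
```

Unlike the IQR rule, this approach flags roughly a fixed share of the observations regardless of how spread out the ratios are.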
For the rest of the vignette, the default outlier method of 3 times the IQR will be used.
df <- df %>% filter(!isOutlier(asr))