View source: R/repsalefourier.R
Standard and Weighted Least Squares Repeat Sales Estimation using Fourier Expansions
price0 |
Earlier price in repeat sales pair |
time0 |
Earlier time in repeat sales pair |
price1 |
Later price in repeat sales pair |
time1 |
Later time in repeat sales pair |
mergefirst |
Number of initial periods with coefficients constrained to zero. Default: mergefirst=1 |
q |
Sets Q, the number of sine and cosine pairs in the Fourier expansion. Default: q=1. |
graph |
If TRUE, graph results. Default: graph=T |
graph.conf |
If TRUE, add confidence intervals to graph. Default: graph.conf=T |
conf |
Confidence level for intervals. Default: .95 |
stage3 |
If stage3=NULL, no correction is made for heteroskedasticity. If stage3="abs", uses the absolute value of the first-stage residuals as the dependent variable in the second-stage regression. If stage3="square", uses the square of the first-stage residuals as the dependent variable. Default: stage3=NULL. |
stage3_xlist |
List of explanatory variables for heteroskedasticity. By default, the single variable timesale = time1-time0 is constructed and used as the explanatory variable when stage3="abs" or stage3="square". Alternatively, a formula can be provided for a user-specified list of explanatory variables, e.g., stage3_xlist=~x1+x2. Important: note the "~" before the variable list. |
print |
If print=T, prints the regression results. Prints one stage only: the first stage when stage3=NULL and the final stage when stage3="square" or stage3="abs". Default: print=T. |
The repeat sales model is
y(t) - y(s) = δ(t) - δ(s) + u(t) - u(s)
where y is the log of sale price, s denotes the earlier sale in a repeat sales pair, and t denotes the later sale. Each entry of the data set should represent a repeat sales pair, with price0 = y(s), price1 = y(t), time0 = s, and time1 = t. The function repsaledata can help convert a standard hedonic data set into a set of repeat sales pairs.
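The pair layout described above can be sketched in base R. This is only an illustration with a made-up data frame (the names sales, id, time, and price are assumptions, not part of the package), and it is not the repsaledata implementation: each sale is simply matched with the next sale of the same property.

```r
# Hypothetical long-format data: one row per sale, price in logs.
sales <- data.frame(
  id    = c(1, 1, 2, 2, 2, 3),
  time  = c(0, 4, 1, 3, 7, 5),
  price = c(11.0, 11.4, 10.5, 10.8, 11.1, 12.0)
)

# Sort by property and date, then pair each sale with the next
# sale of the same property.
sales <- sales[order(sales$id, sales$time), ]
n <- nrow(sales)
same_id <- sales$id[-n] == sales$id[-1]

pairs <- data.frame(
  id     = sales$id[-n][same_id],
  time0  = sales$time[-n][same_id],
  time1  = sales$time[-1][same_id],
  price0 = sales$price[-n][same_id],
  price1 = sales$price[-1][same_id]
)
print(pairs)
```

Properties that sell only once (id 3 here) drop out, and a property with three sales contributes two consecutive pairs.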
The repeat sales model can be derived from a hedonic price function of the form y_{i,t} = δ_t + X_i β + u_{i,t}, where X_i is a vector of variables that are assumed constant over time. repsalefourier replaces δ_t with a smooth continuous function g(T_i), where T_i denotes the time of sale for observation i. Let g(T_i) = α_0 + α_1 z_i + α_2 z_i^2 + ∑_{q=1}^Q \{ λ_q sin(q z_i) + γ_q cos(q z_i) \}, where z_i = 2 π (T_i - min(T_i))/(max(T_i) - min(T_i)). Writing T_i^s and z_i^s for the values at the earlier sale, the repeat sales model becomes y_{i,t} - y_{i,s} = g(T_i) - g(T_i^s) =
α_1 (z_i - z_i^s) + α_2 (z_i^2 - (z_i^s)^2) + ∑_{q=1}^Q \{ λ_q (sin(q z_i) - sin(q z_i^s)) + γ_q (cos(q z_i) - cos(q z_i^s)) \} + u_{i,t} - u_{i,s}
After imposing the constraint that the price index in the base time period equals zero, the index is constructed from the estimated regression using the following expression:
g(T_i) = α_1 z_i + α_2 z_i^2 + ∑_{q=1}^Q \{ λ_q sin(qz_i) + γ_q (cos(qz_i) - 1) \}
More details can be found in McMillen and Dombrow (2001).
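The differenced regression and the index formula can be reproduced by hand in base R. The sketch below uses simulated data and hypothetical helper names (zfun, basis, g); it illustrates the equations in this section and should not be read as the code inside repsalefourier.

```r
set.seed(1)
n <- 500
Q <- 1                                   # order of the Fourier expansion

# Simulated sale dates with time0 < time1 (illustrative data only).
time0 <- sample(0:19, n, replace = TRUE)
time1 <- time0 + 1 + floor(runif(n) * (20 - time0))

# z maps time onto [0, 2*pi].
tmin <- min(time0, time1)
tmax <- max(time0, time1)
zfun <- function(t) 2 * pi * (t - tmin) / (tmax - tmin)

# Basis without the intercept: z, z^2, sin(qz), cos(qz) for q = 1..Q.
basis <- function(z, Q) {
  zq <- outer(z, seq_len(Q))
  cbind(z, z^2, sin(zq), cos(zq))
}

# Assumed "true" index for the simulation, equal to zero at z = 0.
g <- function(z) 0.3 * z - 0.02 * z^2 + 0.1 * sin(z) + 0.05 * (cos(z) - 1)
y0 <- g(zfun(time0)) + rnorm(n, 0, 0.05)
y1 <- g(zfun(time1)) + rnorm(n, 0, 0.05)

# Repeat sales regression: differenced basis, no intercept.
dy   <- y1 - y0
xmat <- basis(zfun(time1), Q) - basis(zfun(time0), Q)
fit  <- lm(dy ~ xmat - 1)

# Index on a time grid, normalized to zero in the base period (z = 0).
tgrid  <- tmin:tmax
b      <- basis(zfun(tgrid), Q)
b0     <- drop(basis(0, Q))
pindex <- drop(sweep(b, 2, b0) %*% coef(fit))
```

Subtracting the basis evaluated at z = 0 is what turns each cos(qz) term into cos(qz) - 1, so the first element of pindex is zero by construction.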
Repeat sales estimates are sometimes very sensitive to sales from the first few time periods, particularly when the sample size is small. The option mergefirst indicates the number of time periods for which the price index is constrained to equal zero. The default is mergefirst = 1, meaning that the price index equals zero for just the first time period. The repsalefourier command does not have an option for including an intercept in the model.
Following Case and Shiller (1987), many authors use a three-stage procedure to construct repeat sales price indexes that are adjusted for heteroskedasticity related to the length of time between sales. Common specifications for the second-stage function are e^2 = α0 + α1 (t-s) or |e| = α0 + α1 (t-s), where e represents the first-stage residuals. With the fitted values from the second-stage regression in place of e, the first equation implies an error variance of σ^2 = e^2 and the second equation leads to σ^2 = |e|^2. The repsalefourier function uses a standard F test to determine whether the slope coefficients are significant in the second-stage regression. The results are reported if print=T.
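A minimal base R sketch of the second stage follows. The residuals e are simulated stand-ins for first-stage residuals, and all variable names are assumptions made for the illustration rather than objects returned by repsalefourier.

```r
set.seed(2)
n <- 1000
timesale <- sample(1:40, n, replace = TRUE)

# Stand-in for first-stage residuals: spread grows with time between sales.
e <- rnorm(n, 0, 0.1 + 0.01 * timesale)

# "square" variant: e^2 = a0 + a1 * (t - s)
fit_sq  <- lm(I(e^2) ~ timesale)
# "abs" variant: |e| = a0 + a1 * (t - s)
fit_abs <- lm(abs(e) ~ timesale)

# Overall F test that the slope coefficients are zero.
fstat <- summary(fit_sq)$fstatistic
pval  <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
            lower.tail = FALSE)

# Implied variances: fitted e^2, or the square of fitted |e|.
sigma2_sq  <- fitted(fit_sq)
sigma2_abs <- fitted(fit_abs)^2
```

A small p-value indicates that the residual spread varies with the time between sales, which is the case the third stage is meant to address.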
The third-stage equation is
(y(t) - y(s))/σ = (g(T) - g(T_s))/σ + (u(t) - u(s))/σ
This equation is estimated by regressing y(t) - y(s) on the between-sale differences in z, z^2, sin(z)...sin(Qz), cos(z)...cos(Qz), using the weights option in lm with weights = 1/sigma^2.
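The equivalence between dividing both sides by σ and passing weights = 1/sigma^2 to lm can be checked on a toy problem. Here x stands in for a single differenced regressor and sigma is simulated; both are assumptions made for the sketch.

```r
set.seed(3)
n <- 200
x <- rnorm(n)
sigma <- runif(n, 0.5, 2)
y <- 1.5 * x + rnorm(n, 0, sigma)

# (a) Divide both sides by sigma and run OLS without an intercept.
fit_a <- lm(I(y / sigma) ~ I(x / sigma) - 1)

# (b) Leave the data alone and pass weights = 1/sigma^2 to lm.
fit_b <- lm(y ~ x - 1, weights = 1 / sigma^2)

# The two slope estimates agree up to numerical tolerance.
same_slope <- all.equal(unname(coef(fit_a)), unname(coef(fit_b)))
print(same_slope)
```

The weights form is convenient because fitted values and residuals stay on the original scale of y(t) - y(s).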
fit |
Full regression model. |
pindex |
The estimated price index. |
lo |
The lower bounds for the price index confidence intervals. |
hi |
The upper bounds for the price index confidence intervals. |
dy |
The dependent variable for the repeat sales regression, dy = price1-price0. |
xmat |
The matrix of explanatory variables for the repeat sales regressions. The matrix has 2 + 2Q columns. |
Case, Karl and Robert Shiller, "Prices of Single-Family Homes since 1970: New Indexes for Four Cities," New England Economic Review (1987), 45-56.
McMillen, Daniel P. and Jonathan Dombrow, "A Flexible Fourier Approach to Repeat Sales Price Indexes," Real Estate Economics 29 (2001), 207-225.
repsale
repsaledata
repsaleqreg
set.seed(189)
n = 2000
# sale dates range from 0-50
# drawn uniformly from all possible time0, time1 combinations with time0<time1
tmat <- expand.grid(seq(0,50), seq(0,50))
tmat <- tmat[tmat[,1]<tmat[,2], ]
tobs <- sample(seq_len(nrow(tmat)),n,replace=TRUE)
time0 <- tmat[tobs,1]
time1 <- tmat[tobs,2]
timesale <- time1-time0
timesale2 <- timesale^2
par(ask=TRUE)
z0 <- 2*pi*time0/50
z0sq <- z0^2
sin0 <- sin(z0)
cos0 <- cos(z0)
z1 <- 2*pi*time1/50
z1sq <- z1^2
sin1 <- sin(z1)
cos1 <- cos(z1)
ybase0 <- z0 + .05*z0sq -.5*sin0 - .5*cos0
miny <- min(ybase0)
ybase0 <- ybase0-miny
ybase1 <- z1 + .05*z1sq -.5*sin1 - .5*cos1 - miny
maxy <- max(ybase1)
ybase0 <- ybase0/maxy
ybase1 <- ybase1/maxy
summary(data.frame(ybase0,ybase1))
sig1 = sd(c(ybase0,ybase1))/2
y0 <- ybase0 + rnorm(n,0,sig1)
y1 <- ybase1 + rnorm(n,0,sig1)
fit <- lm(y0~z0+z0sq+sin0+cos0)
summary(fit)
plot(time0,fitted(fit))
fit <- lm(y1~z1+z1sq+sin1+cos1)
summary(fit)
plot(time1,fitted(fit))
fit1 <- repsale(price1=y1,price0=y0,time1=time1,time0=time0,graph=FALSE,
mergefirst=5)
fit2 <- repsalefourier(price1=y1,price0=y0,time1=time1,time0=time0,q=1,
graph=FALSE,mergefirst=5)
timevar <- seq(0,50)
plot(timevar,fit1$pindex,type="l",xlab="Time",ylab="Index",
ylim=c(min(fit1$pindex),max(fit2$pindex)))
lines(timevar,fit2$pindex)
# variance rises with timesale
# var(u0) = sig1^2; var(u1) = (sig1 + timesale/50)^2
# var(u1-u0) = var(u0) + var(u1) = 2*(sig1^2) + 2*sig1*timesale/50 + (timesale^2)/2500
y0 <- ybase0 + rnorm(n,0,sig1)
y1 <- ybase1 + rnorm(n,0,sig1+timesale/50)
par(ask=TRUE)
fit1 <- repsalefourier(price0=y0, price1=y1, time0=time0, time1=time1,
graph=FALSE)
fit2 <- repsalefourier(price0=y0, price1=y1, time0=time0, time1=time1,
graph=FALSE,stage3="abs",stage3_xlist=~timesale+timesale2)
plot(timevar,fit1$lo,type="l",xlab="Time",ylab="Index",
ylim=c(min(fit1$lo,fit2$lo),max(fit1$hi,fit2$hi)))
lines(timevar,fit1$hi)
lines(timevar,fit2$lo,col="red")
lines(timevar,fit2$hi,col="red")