panelAR: Estimation of Linear AR(1) Panel Data Models with...

View source: R/panelAR.R

Description

The function estimates linear models on panel data structures in the presence of AR(1)-type autocorrelation as well as panel heteroskedasticity and/or contemporaneous correlation. First, AR(1)-type autocorrelation is addressed via a two-step Prais-Winsten feasible generalized least squares (FGLS) procedure, where the autocorrelation coefficients may be panel-specific. Subsequently, one can choose to implement ‘sandwich’-type robust standard errors with OLS, panel weighted least squares (WLS), panel-corrected standard errors (PCSEs), or the Parks-Kmenta FGLS estimator.

Usage

panelAR(formula, data, panelVar, timeVar,
    autoCorr = c("ar1", "none", "psar1"),
    panelCorrMethod = c("none", "phet", "pcse", "pwls", "parks"),
    rhotype = "breg", bound.rho = FALSE, rho.na.rm = FALSE,
    panel.weight = c("t-1", "t"), dof.correction = FALSE,
    complete.case = FALSE, seq.times = FALSE, singular.ok = TRUE)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

a data frame containing the variables in the model, as well as the variables defining the units and time.

panelVar

the column name of data that contains the panel ID. It cannot contain any NAs. May be set to NULL, in which case all observations are assumed to belong to the same unit.

timeVar

the column name of data that contains the time ID. It must be a vector of integers and cannot contain any NAs. Duplicate time observations per panel are not allowed. At least two time periods are required.

autoCorr

character string denoting structure of autocorrelation in the data: ar1 denotes AR(1)-type autocorrelation with a common correlation coefficient across all panels, psar1 denotes AR(1)-type autocorrelation with a unique correlation coefficient for each panel, and none denotes no autocorrelation. Default: ar1.

panelCorrMethod

character string denoting method used for dealing with panel heteroskedasticity and/or correlation. none denotes homoskedasticity and no correlation across panels, phet denotes a Huber-White style sandwich estimator for panel heteroskedasticity, pcse denotes panel-corrected standard errors that are robust to both heteroskedasticity and contemporaneous correlation across panels, pwls denotes that a panel weighted least squares procedure is used to deal with panel heteroskedasticity, and parks means that Parks-Kmenta FGLS is used to estimate both panel heteroskedasticity and correlation. Default: none.

rhotype

character string denoting method used for estimating autocorrelation coefficient, ρ. Possible options are breg, scorr, freg, theil, dw, and theil-nagar. See ‘Details’. Default: breg.

bound.rho

logical. If TRUE, the panel-specific autocorrelation coefficient ρ_i is bounded to [-1,1] in the calculation of ρ; used only for autoCorr="ar1". Default: FALSE.

rho.na.rm

logical. If FALSE and ρ_i cannot be calculated for a panel, the function returns an error. If TRUE, any ρ_i that is NA is ignored when calculating a common AR(1) coefficient, or set to 0 when calculating panel-specific AR(1) coefficients. Default: FALSE.

panel.weight

the weight to be used for each panel when combining panel-specific autocorrelations ρ_i to a common ρ. Weight is either the number of time periods in the corresponding panel (t) or the number of time periods minus 1 (t-1). Default: t-1.

dof.correction

logical. If TRUE, standard errors are adjusted by a factor of N/(N-k), where N is total number of observations and k is the rank of the linear model. Default: FALSE.

complete.case

logical. If TRUE, use only the time periods where every panel has a valid observation in the estimation of PCSEs or the Parks-Kmenta estimator. Otherwise, use pairwise procedure. Default: FALSE.

seq.times

logical. If TRUE, observations are temporally ordered by panel and assigned a sequential time variable that ignores any gaps in the runs. Default: FALSE.

singular.ok

logical. If FALSE, a singular fit results in an error. Default: TRUE.
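
The arguments bound.rho, rho.na.rm, and panel.weight all act on the step that turns panel-specific estimates ρ_i into one common ρ. The base R sketch below illustrates how they might interact. The helper combine_rho is hypothetical and not part of panelAR; it assumes that bounding is a simple clip to [-1,1] and that the common coefficient is a weighted mean of the ρ_i, neither of which this page states explicitly. Its defaults mirror the Usage signature.

```r
# Hypothetical helper (not part of panelAR): combine panel-specific AR(1)
# estimates rho_i into one common coefficient.
# ASSUMPTIONS: bounding is a clip to [-1, 1]; the common value is a weighted
# mean of the rho_i with weights T_i ("t") or T_i - 1 ("t-1").
combine_rho <- function(rho_i, n_obs, panel.weight = c("t-1", "t"),
                        bound.rho = FALSE, rho.na.rm = FALSE) {
  panel.weight <- match.arg(panel.weight)
  if (anyNA(rho_i)) {
    if (!rho.na.rm) stop("rho_i could not be calculated for some panels")
    keep  <- !is.na(rho_i)          # drop panels without an estimate
    rho_i <- rho_i[keep]
    n_obs <- n_obs[keep]
  }
  if (bound.rho) rho_i <- pmin(pmax(rho_i, -1), 1)
  w <- if (panel.weight == "t") n_obs else n_obs - 1
  sum(w * rho_i) / sum(w)
}

rho_i <- c(0.2, 1.3, NA)   # third panel has a single observation
n_obs <- c(5, 3, 1)
rho_common <- combine_rho(rho_i, n_obs, bound.rho = TRUE, rho.na.rm = TRUE)
```

With rho.na.rm left at FALSE, the same call stops with an error because the third panel has no estimate.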

Details

Function for running two-step Prais-Winsten models on panel data that exhibit AR(1)-type autocorrelation. Following the two-step estimation, one can choose to use a ‘sandwich’-type robust standard error estimator with OLS or a panel weighted least squares estimator to address panel heteroskedasticity. Alternatively, if panels are both heteroskedastic and contemporaneously correlated, the package supports panel-corrected standard errors (PCSEs) as well as the Parks-Kmenta FGLS estimator. Note that the Parks-Kmenta estimator should ideally be reserved for use only when the number of time periods is significantly greater than the number of panels (see Beck and Katz 1995). The function is robust to unbalanced panel structures, panels with just one observation, multiple runs per panel, and the presence of panels without any overlapping observations.
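
To make the two-step procedure concrete, here is a minimal base R sketch of the textbook Prais-Winsten transformation for a single series without gaps. It is not the package's implementation: the helper prais_winsten_once and the simulated data are made up for illustration, and the sketch ignores panels, multiple runs, and the robust variance options described above.

```r
# Hypothetical helper (not part of panelAR): two-step Prais-Winsten for ONE
# gap-free series, using a regression of e_t on e_{t-1} to estimate rho.
prais_winsten_once <- function(y, X) {
  n <- length(y)
  # Step 1: first-stage OLS residuals and the AR(1) coefficient estimate
  e   <- lm.fit(X, y)$residuals
  rho <- sum(e[-1] * e[-n]) / sum(e[-n]^2)
  stopifnot(abs(rho) < 1)
  # Step 2: rescale the first row by sqrt(1 - rho^2), quasi-difference the
  # remaining rows, then run OLS on the transformed data
  s      <- sqrt(1 - rho^2)
  y_star <- c(s * y[1], y[-1] - rho * y[-n])
  X_star <- rbind(s * X[1, , drop = FALSE],
                  X[-1, , drop = FALSE] - rho * X[-n, , drop = FALSE])
  list(rho = rho, coefficients = lm.fit(X_star, y_star)$coefficients)
}

# Simulated example: y = 1 + 2x + u, where u follows an AR(1) process
set.seed(1)
n <- 200
x <- rnorm(n)
u <- as.numeric(arima.sim(list(ar = 0.6), n = n))
y <- 1 + 2 * x + u
fit <- prais_winsten_once(y, cbind("(Intercept)" = 1, x = x))
fit$rho
fit$coefficients
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from Cochrane-Orcutt, which drops it.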

While generally designed to estimate Prais-Winsten models on panel data, setting panelVar to NULL will estimate an AR(1) time-series model treating the entire dataset as one unit. In this case, the panelCorrMethod is ignored since equal variances are assumed across all observations.

A number of common estimators for the autocorrelation coefficient are supported. Specifically:

breg

Linear regression estimator: \hat{ρ}_{breg} = \frac{∑_{t=2}^{T_i} \hat{ε}_{i,t}\hat{ε}_{i,t-1}}{∑_{t=1}^{T_i-1} \hat{ε}_{i,t}^2}

scorr

Sample correlation coefficient estimator: \hat{ρ}_{scorr} = \frac{∑_{t=2}^{T_i} \hat{ε}_{i,t}\hat{ε}_{i,t-1}}{∑_{t=1}^{T_i} \hat{ε}_{i,t}^2}

freg

Forward linear regression estimator: \hat{ρ}_{freg} = \frac{∑_{t=1}^{T_i-1} \hat{ε}_{i,t}\hat{ε}_{i,t+1}}{∑_{t=1}^{T_i-1} \hat{ε}_{i,t+1}^2}

theil

Theil estimator: \hat{ρ}_{theil} = \hat{ρ}_{scorr} \frac{T_i-k}{T_i-1}

dw

Durbin-Watson estimator: \hat{ρ}_{dw} = 1-\frac{1}{2} \frac{∑_{t=2}^{T_i} (\hat{ε}_{i,t}-\hat{ε}_{i,t-1})^2}{∑_{t=1}^{T_i} \hat{ε}_{i,t}^2}

theil-nagar

Theil-Nagar estimator: \hat{ρ}_{theil-nagar} = \frac{T_i^2 \hat{ρ}_{dw} + k^2}{T_i^2-k^2}

In the expressions above, \hat{ε} denotes observed residuals from the first stage OLS regression, T_i is the number of observations in panel i, and k is the rank of the model matrix. Some of these estimators cannot be calculated for panels with one observation or multiple runs of one observation. In these cases, rho.na.rm controls the treatment of these autocorrelation coefficients. If FALSE, the function returns an error. If TRUE, panel-specific coefficients ρ_i that are NA are ignored when calculating a common AR(1) coefficient and set to 0 when calculating panel-specific AR(1) coefficients.
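
The six expressions above can be transcribed directly into base R. The sketch below does so for the residuals of a single panel observed as one unbroken run. The helper rho_estimators and the residual vector are hypothetical, and the package's handling of multiple runs is not reproduced here.

```r
# Hypothetical helper (not part of panelAR): AR(1) coefficient estimators for
# one panel, given first-stage residuals `e` (a single run) and model rank `k`.
rho_estimators <- function(e, k) {
  n <- length(e)                  # T_i in the formulas
  stopifnot(n >= 2)
  lead  <- e[-1]                  # e_{i,t}   for t = 2, ..., T_i
  lag   <- e[-n]                  # e_{i,t-1} for t = 2, ..., T_i
  cross <- sum(lead * lag)
  scorr <- cross / sum(e^2)
  dw    <- 1 - 0.5 * sum((lead - lag)^2) / sum(e^2)
  c(breg          = cross / sum(lag^2),
    scorr         = scorr,
    freg          = cross / sum(lead^2),
    theil         = scorr * (n - k) / (n - 1),
    dw            = dw,
    "theil-nagar" = (n^2 * dw + k^2) / (n^2 - k^2))
}

e   <- c(1, -1, 2, 0, 1)          # made-up residuals for one panel
rho <- rho_estimators(e, k = 2)
rho
```

For a panel with a single observation the cross-products are empty, which is why such panels cannot contribute an estimate.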

If PCSEs or the Parks-Kmenta estimator are selected, the default is to use all pairwise observations to estimate the time-constant covariances across units. In the case of no overlapping observations between panels, the panel covariance is assumed to be 0. If complete.case is set to TRUE, then only the time periods where every panel has a valid observation are used for the calculation of the contemporaneous correlation matrix.
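
The difference between the pairwise and complete-case procedures can be sketched in base R as follows. The helper panel_cov is hypothetical and not part of panelAR. It assumes each covariance is the mean cross-product of residuals over the time periods two panels share, which may differ from the package's exact scaling.

```r
# Hypothetical helper (not part of panelAR): contemporaneous covariance of
# panel residuals. `E` is an N_p x T residual matrix with NA where a panel
# has no observation.
# ASSUMPTION: Sigma_ij is the mean of e_it * e_jt over shared time periods.
panel_cov <- function(E, complete.case = FALSE) {
  if (complete.case) {
    # keep only the time periods in which every panel is observed
    E <- E[, colSums(is.na(E)) == 0, drop = FALSE]
  }
  obs <- !is.na(E)
  E0 <- E
  E0[!obs] <- 0                          # missing cells contribute nothing
  n_shared <- tcrossprod(obs * 1)        # periods shared by each pair of panels
  Sigma <- tcrossprod(E0) / n_shared     # 0/0 gives NaN where nothing is shared
  Sigma[n_shared == 0] <- 0              # no overlap: covariance assumed to be 0
  Sigma
}

# Made-up residuals: panel "c" never overlaps with "a" or "b"
E <- rbind(a = c(1, -1, NA, NA),
           b = c(2,  0,  1, NA),
           c = c(NA, NA, NA, 3))
S_pairwise <- panel_cov(E)
S_complete <- panel_cov(E[c("a", "b"), ], complete.case = TRUE)
```

In the pairwise result, panel "b" uses all three of its observations for its own variance, while the complete-case result uses only the two periods it shares with "a".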

Value

panelAR returns an object of class "panelAR".

The function summary can be used to obtain and print a summary of the results. Note that default methods coefficients, fitted.values, and residuals return vectors of regression coefficients, fitted values, and residuals, respectively. vcov returns the estimated variance-covariance matrix of the coefficients.

An object of class "panelAR" contains the following components, very similar to the outputs of the standard lm function:

coefficients

the named vector of coefficients.

residuals

the residuals.

fitted.values

the fitted mean values.

rank

the numeric rank of the fitted linear model.

df.residual

the residual degrees of freedom.

call

the matched call.

terms

the terms object used.

model

the model frame used.

aliased

named logical vector designating if original coefficients are aliased.

na.action

information returned by model.frame in the handling of NAs.

vcov

estimated variance-covariance matrix of coefficients.

r2

R^2 based on quasi-differenced data from the Prais-Winsten regression. Set to NULL if PWLS or Parks-Kmenta procedures are used.

panelStructure

a list of several objects which contain information on the panel structure of the data. See details below.

Details of panelStructure:

obs.mat

logical matrix of dimension N_p \times T, where N_p is the number of panels. If cell value is TRUE, panel i at time t has a valid observation. Panel structure is balanced if entire matrix is TRUE.

rho

autocorrelation parameters. Scalar if "ar1" option was used, vector of length N_p (number of panels) if "psar1" option was used, and NULL if "none" option was used.

Sigma

N_p \times N_p matrix of estimated panel covariances.

N.cov

number of panel covariances estimated.
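
As an illustration of what obs.mat encodes, the base R sketch below rebuilds a matrix of the same shape from raw unit and time columns. The data frame d and its column names are made up for this example; the package constructs obs.mat internally and its row and column labels may differ.

```r
# Illustrative only: an obs.mat-style logical matrix built from raw columns.
# The data frame `d` and its column names are hypothetical.
d <- data.frame(unit = c("A", "A", "A", "B", "B", "C"),
                year = c(2001L, 2002L, 2003L, 2001L, 2003L, 2002L))

# N_p x T matrix: TRUE where the unit has an observation in that year
obs.mat <- unclass(table(d$unit, d$year)) > 0

balanced      <- all(obs.mat)      # TRUE only when no cell is missing
obs.per.panel <- rowSums(obs.mat)  # number of observations in each panel
```

Here unit "B" skips 2002, so the design is unbalanced and "B" contains two runs of one observation each.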

Author(s)

Konstantin Kashin [email protected]

References

Beck, Nathaniel and Jonathan N. Katz. 1995. “What to do (and not to do) with time-series cross-section data.” Am. Polit. Sci. Rev. 89:634-47.

Greene, William H. 2012. Econometric Analysis. 7ed. Prentice Hall.

Judge, George G., William E. Griffiths, R. Carter Hill, Helmut Lütkepohl, and Tsoung-Chao Lee. 1985. The Theory and Practice of Econometrics. 2ed. John Wiley & Sons.

Prais, S., and C. Winsten. 1954. “Trend Estimation and Serial Correlation.” Cowles Commission Discussion Paper No. 383, Chicago.

See Also

summary.panelAR for summary.

predict.panelAR for prediction.

plot.panelAR to plot image of panel structure.

run.analysis for analysis of runs.

Examples

# Common AR(1) with PCSE
data(Rehm)
out <- panelAR(NURR ~ gini + mean_ur + selfemp + cum_right + tradeunion + deficit + 
tradeopen + gdp_growth, data=Rehm, panelVar='ccode', timeVar='year', autoCorr='ar1', 
panelCorrMethod='pcse', rho.na.rm=TRUE, panel.weight='t-1', bound.rho=TRUE)
summary(out)

# Panel-specific AR(1) with PCSE
data(WhittenWilliams)
# expect warning urging to use 'complete.case=FALSE' 
out2 <- panelAR(milex_gdp~lag_milex_gdp+GOV_rl+gthreat+GOV_min+GOV_npty+election_yr+
lag_real_GDP_gr+cinclag+lag_alliance+lag_cinc_ratio+lag_us_change_milex_gdp, 
data=WhittenWilliams, panelVar="ccode", timeVar="year", autoCorr="psar1", 
panelCorrMethod="pcse", complete.case=TRUE) 
summary(out2)
summary(out2)$rho

# Panel-specific AR(1) correlation with PWLS	
data(BrooksKurtz)
out3 <- panelAR(kaopen ~ ldiffpeer + ldiffisi + ldiffgrowth + ldiffinflation + 
ldiffneg + ldiffembi + limf + isi_objective + partisan + checks +  lusffr + 
linflation + lbankra + lcab + lgrowth +  ltradebalance + lngdpcap + lngdp + 
brk + timetrend + y1995, data=BrooksKurtz, panelVar='country', timeVar='year', 
autoCorr='psar1', panelCorrMethod='pwls',rho.na.rm=TRUE, panel.weight='t', 
seq.times=TRUE)
summary(out3)

Example output

Panel-specific correlations bounded to [-1,1]

Panel Regression with AR(1) Prais-Winsten correction and panel-corrected standard errors

Unbalanced Panel Design:                                             
 Total obs.:       75 Avg obs. per panel 3.75
 Number of panels: 20 Max obs. per panel 4   
 Number of times:  4  Min obs. per panel 1   

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 98.849636   9.463231  10.446 1.27e-15 ***
gini        -1.517884   0.383107  -3.962 0.000185 ***
mean_ur      0.030108   0.226004   0.133 0.894426    
selfemp      0.286876   0.095881   2.992 0.003895 ** 
cum_right   -0.012181   0.002015  -6.044 7.76e-08 ***
tradeunion   0.029370   0.047242   0.622 0.536292    
deficit      0.497009   0.242515   2.049 0.044401 *  
tradeopen    0.060517   0.022469   2.693 0.008959 ** 
gdp_growth  -0.702535   0.504288  -1.393 0.168258    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-squared: 0.8504
Wald statistic: 163.8146, Pr(>Chisq(8)): 0
The following units have non-consecutive observations. Use runs.analysis() on output for additional details: 235.
Warning message:
The number of time periods used for the calculation of correlated SEs / PCSEs (18) is less than half the average number of time periods per panel (40.84). Consider setting complete.case=FALSE. 

Panel Regression with AR(1) Prais-Winsten correction and panel-corrected standard errors

Unbalanced Panel Design:                                                 
 Total obs.:       776 Avg obs. per panel 40.8421
 Number of panels: 19  Max obs. per panel 46     
 Number of times:  46  Min obs. per panel 19     

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)              0.1215313  0.0885570   1.372   0.1704    
lag_milex_gdp            0.9317542  0.0183304  50.831   <2e-16 ***
GOV_rl                  -0.0016167  0.0008436  -1.916   0.0557 .  
gthreat                  0.0053543  0.0028483   1.880   0.0605 .  
GOV_min                  0.0363388  0.0317194   1.146   0.2523    
GOV_npty                 0.0087735  0.0110240   0.796   0.4264    
election_yr              0.0085757  0.0264547   0.324   0.7459    
lag_real_GDP_gr          0.6103200  0.5452297   1.119   0.2633    
cinclag                  1.3712229  2.2334560   0.614   0.5394    
lag_alliance             0.0194962  0.0357330   0.546   0.5855    
lag_cinc_ratio          -0.0313205  0.0658145  -0.476   0.6343    
lag_us_change_milex_gdp  0.0580540  0.0270741   2.144   0.0323 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-squared: 0.9238
Wald statistic: 5671.9587, Pr(>Chisq(11)): 0
         20         200         205         210         211         220 
 0.15750269  0.15528457 -0.05497798  0.14284367  0.29303502  0.12308926 
        225         230         235         305         325         350 
 0.05500530  0.05912737  0.16965981  0.17480153  0.20243282  0.26654458 
        375         380         385         390         640         900 
 0.11134871  0.08130810 -0.14279187 -0.44085962  0.11414091 -0.11631097 
        920 
 0.17101495 

Panel Regression with AR(1) Prais-Winsten correction and panel weighted least squares

Unbalanced Panel Design:                                                 
 Total obs.:       403 Avg obs. per panel 21.2105
 Number of panels: 19  Max obs. per panel 25     
 Number of times:  25  Min obs. per panel 2      

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     8.953114   2.416270   3.705 0.000242 ***
ldiffpeer       0.223179   0.242990   0.918 0.358953    
ldiffisi        0.110727   0.189322   0.585 0.558986    
ldiffgrowth    -0.005512   0.037860  -0.146 0.884323    
ldiffinflation -0.042805   0.040706  -1.052 0.293668    
ldiffneg        0.019432   0.038966   0.499 0.618276    
ldiffembi       0.001582   0.063225   0.025 0.980057    
limf            0.066433   0.058457   1.136 0.256486    
isi_objective   1.279863   0.334407   3.827 0.000151 ***
partisan        0.065469   0.036738   1.782 0.075538 .  
checks          0.037229   0.028710   1.297 0.195514    
lusffr         -0.006220   0.015679  -0.397 0.691812    
linflation     -0.087156   0.034060  -2.559 0.010886 *  
lbankra        -0.005701   0.002789  -2.044 0.041597 *  
lcab           -0.005822   0.008726  -0.667 0.505000    
lgrowth        -0.011393   0.005697  -2.000 0.046249 *  
ltradebalance   0.006412   0.008102   0.791 0.429174    
lngdpcap        0.230220   0.165544   1.391 0.165133    
lngdp          -0.545590   0.099243  -5.497 7.07e-08 ***
brk             0.225496   0.166099   1.358 0.175394    
timetrend       0.084515   0.019839   4.260 2.58e-05 ***
y1995           0.132824   0.125276   1.060 0.289702    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Wald statistic: 325.2936, Pr(>Chisq(21)): 0

panelAR documentation built on May 1, 2019, 8:19 p.m.