This vignette explains how to use the package panelhetero in R.

The package implements model-free procedures proposed by Okui and Yanagi (2019a, 2019b) to examine the degree of heterogeneity across cross-sectional units based on panel data.

Setting

Let ${ { y_{it} }{t=1}^T }{i=1}^N$ be balanced panel data where $y_{it}$ is a scalar random variable, $N$ is the number of cross-sectional units, and $T$ is the length of time series. We assume that the individual time series ${ y_{it} }_{t=1}^T$ is strictly stationary across time but heterogeneous acorss units.

To examine the properties of heterogeneity, we focus on estimating the cumulative distribution function (CDF), density, and moments (i.e., the means, variances, and correlations) of the heterogeneous mean, $k$-th autocovariance, and $k$-th autocorrelation:

$\mu_i = E(y_{it}|i)$, $\gamma_{k,i} = E[(y_{it} - \mu_i)(y_{i,t-k} - \mu_i)|i]$, and $\rho_{k,i} = \gamma_{k,i}/\gamma_{0,i}$,

where $E(\cdot|i)$ indicates the population mean for the individual time series. In words, $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ are the population mean, $k$-th autocovariance, and $k$-th autocorrelation for the individual time series. Note that $\gamma_{0, i}$ is the variance. They are random variables as discussed in Okui and Yanagi (2019a, 2019b). We assume that they are i.i.d. across units.

The estimators of $\mu_i$, $\gamma_{k,i}$, $\rho_{k,i}$ are given by the sample analogues:

$\hat \mu_i = \frac{1}{T} \sum_{t=1}^T y_{it}$, $\hat \gamma_{k,i} = \frac{1}{T-k} \sum_{t=k+1}^T (y_{it} - \hat \mu_i)(y_{i,t-k} - \hat \mu_i)$, and $\hat \rho_{k,i} = \hat \gamma_{k,i} / \hat \gamma_{0,i}$.

For $\xi_i = \mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$, the parameters of interest are the CDF $F_{\xi}(\cdot) = \Pr(\xi_i \le \cdot)$, the density $f_{\xi}(\cdot)$, the mean $E(\xi_i)$, the variance $var(\xi_i)$, and the correlations such as $cor(\mu_i, \gamma_{k,i})$.

Okui and Yanagi (2019a) develop empirical distribution estimation of the CDF and the moments of the true $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ based on the estimated $\hat \mu_i$, $\hat \gamma_{k,i}$, and $\hat \rho_{k,i}$. Okui and Yanagi (2019b) propose nonparametric kernel smoothing estimation for the density and the CDF of the true $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ based on the estimated $\hat \mu_i$, $\hat \gamma_{k,i}$, and $\hat \rho_{k,i}$. Further, the papers develop half-panel jaccknife (HPJ) and third-order jackknife (TOJ) bias-correction methods to eliminate asymptotic biases of the estimators and cross-sectional boostrap procedures for statistical inferences. Note that HPJ and TOJ require that $T \ge 4$ and that $T \ge 6$, respectively. See the papers for more details.

panelhetero implements these procedures.

Functions

panelhetero contains the following functions.

In the following, we explain the arguments and the returned values of each function.

nemoment

nemoment implements the naive estimation for the moments without bias-correction. The parameters are the means, variances, and correlations:

$E(\mu_i)$, $E(\gamma_{k,i})$, $E(\rho_{k,i})$, $var(\mu_i)$, $var(\gamma_{k,i})$, $var(\rho_{k,i})$, $cor(\mu_i, \gamma_{k,i})$, $cor(\mu_i, \rho_{k,i})$, and $cor(\gamma_{k,i}, \rho_{k,i})$.

The usage is nemoment(data, acov_order = 0, acor_order = 1, R = 1000) where the arguments are as follows.

The returned value is a list that contains the following elements.

Note: The bootstrap procedure depends on random number generation. For reproducibility, it is highly recommended to set the seed of random number generator via set.seed before implementing nemoment.

hpjmoment

hpjmoment implements the HPJ bias-corrected estimation for the moments. The parameters are the same as those of nemoment. The usage is hpjmoment(data, acov_order = 0, acor_order = 1, R = 1000) where the arguments are the same as those of nemoment. The returned value is a list that contains the same elements as nemoment.

tojmoment

tojmoment implements the TOJ bias-corrected estimation for the moments. The parameters are the same as those of nemoment. The usage is tojmoment(data, acov_order = 0, acor_order = 1, R = 1000) where the arguments are the same as those of nemoment. The returned value is a list that contains the same elements as nemoment.

neecdf

neecdf implements the naive estimation for the empirical CDF without bias-correction. The parameters are the CDFs of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$.

The usage is neecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE) where the arguments are as follows.

The returned value is a list that contains the following elements.

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

hpjecdf

hpjecdf implements the HPJ bias-corrected estimation for the empirical CDF. The parameters are the same as those of neecdf. The usage is hpjecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE) where the arguments are the same as those of neecdf. The returned value is a list that contains the same elements as neecdf.

Note: Since the HPJ and TOJ bias-corrected CDF estimators may be a non-monotonic function as a consequence of the bias-correction, we convert these bias-corrected estimators into monotonic functions based on Rearrangement.

Note: The functions mean_func, acov_func, and acor_func in hpjecdf and tojecdf are not subject to rearrangement, since Rearrangement cannot return functions. Nontheless, you can manually use Rearrangement to obtain rearranged estimates.

Note: Since the HPJ and TOJ bias-corrected CDF estimates may be smaller than 0 or greater than 1 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0 and no larger than 1.

tojecdf

tojecdf implements the TOJ bias-corrected estimation for the empirical CDF. The parameters are the same as those of neecdf. The usage is tojecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE) where the arguments are the same as those of neecdf. The returned value is a list that contains the same elements as neecdf.

nekd

nekd implements the naive estimation for the kernel density without bias-correction. The parameters are the densities of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$. The function uses the Gaussian kernel for the kernel density estimation.

The usage is nekd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are as follows.

The returned value is a list that contains the following elements.

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

hpjkd

hpjkd implements the HPJ bias-corrected estimation for the kernel density. The parameters are the same as those of nekd. The usage is hpjkd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are the same as those of nekd. The returned value is a list that contains the same elements as nekd.

Note: Since the HPJ and TOJ bias-corrected density estimates may be smaller than 0 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0.

tojkd

tojkd implements the TOJ bias-corrected estimation for the kernel density. The parameters are the same as those of nekd. The usage is tojkd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are the same as those of nekd. The returned value is a list that contains the same elements as nekd.

nekcdf

nekcdf implements the naive estimation for the kernel CDF without bias-correction. The parameters are the CDFs of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$. The function uses the Gaussian CDF kernel function for the kernel CDF estimation.

The usage is nekcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are as follows.

The returned value is a list that contains the following elements.

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

hpjkcdf

hpjkcdf implements the HPJ bias-corrected estimation for the kernel CDF. The parameters are the same as those of nekcdf. The usage is hpjkcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are the same as those of nekcdf. The returned value is a list that contains the same elements as nekcdf.

Note: Since the HPJ and TOJ bias-corrected CDF estimators may be non-monotonic functions as a consequence of the bias-correction, we convert these bias-corrected estimators into monotonic functions based on Rearrangement.

Note: The functions mean_func, acov_func, and acor_func in hpjkcdf and tojkcdf are not subject to rearrangement, since Rearrangement cannot return functions. Nontheless, you can manually use Rearrangement to obtain rearranged estimates.

Note: Since the HPJ and TOJ bias-corrected CDF estimates may be smaller than 0 or greater than 1 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0 and no larger than 1.

tojkcdf

tojkcdf implements the TOJ bias-corrected estimation for the kernel CDF. The parameters are the same as those of nekcdf. The usage is tojkcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL) where the arguments are the same as those of nekcdf. The returned value is a list that contains the same elements as nekcdf.

Example

We illustrate the use of panelhetero through the following simple example. We generate simulated data based on AR(1) model with heterogeneous random coefficients.

set.seed(101)

# sample sizes
N <- 300
S <- 50

# heterogeneous random coefficients
c <- rnorm(N, mean = 0, sd = 1)
phi <- rbeta(N, shape1 = 2, shape2 = 5)
sigma <- rbeta(N, shape1 = 2, shape2 = 5) + 0.5

# initializing N * (T + 1) data matrix including intial values
y <- matrix(0, nrow = N, ncol = S + 1)

# generated by AR(1) model
for(i in 1:N) {

  # unobserved initial value
  y[i,1] <- rnorm(1, mean = c[i], sd = sigma[i])

  for(t in 2:(S+1)) {
    y[i, t] <- (1 - phi[i]) * c[i] + phi[i] * y[i, t-1] + 
      sqrt((1 - phi[i])^2 * sigma[i]^2) * rnorm(1, mean = 0, sd = 1)
  }
}

# N * T data matrix
y <- y[, -1]

We can estimate the moments of the heterogeneous mean, autocovariance, and autocorrelation via nemoment, hpjmoment, and tojmoment. For example, we can implement the HPJ bias-corrected estimation for the moments as follows.

library("panelhetero")

set.seed(101)

r1 <- hpjmoment(data = y, acov_order = 0, acor_order = 1)

r1$estimate

r1$se

r1$ci

We can also implement the empirical and kernel CDF estimation and the kernel density estimation via neecdf, hpjecdf, tojecdf, nekd, hpjkd, tojkd, nekcdf, hpjkcdf, and tojkcdf. For example, we can estimate the density of $\mu_i$ based on the HPJ bias-corrected kernel density estimation as follows.

r2 <- hpjkd(data = y, acov_order = 0, acor_order = 1)

r2$mean

The graph is plotted via ggplot2, so that we can customize it through standard commands for ggplot2 (e.g., customizing the title, background, axes, and labels). For example, we can customize the title and the theme as follows.

library(ggplot2)

r2$mean + ggtitle("The density of the mean") + theme_classic()

Also, neecdf, hpjecdf, tojecdf, nekd, hpjkd, tojkd, nekcdf, hpjkcdf, and tojkcdf return the functions mean_func, acov_func, and acor_func that return values of the estimated functions. By using these functions, you can plot graphs on your own. For example, the graph of the HPJ bias-corrected density for the mean can be plotted as follows.

ggplot(data = data.frame(x = c(-3.5, 3.5)), aes(x = x)) + 
  stat_function(fun = r2$mean_func) + ggtitle("The density of the mean")

This usage is useful especially when you would like to plot a graph with a specified range or when you would like to overlay several graphs.

See also

ggplot2, kerdiest, KernSmooth, Rearrangement,

References

Ryo Okui, Takahide Yanagi (2019a): "Panel Data Analysis with Heterogeneous Dynamics", Journal of Econometrics, 212(2), 451-475.

Ryo Okui, Takahide Yanagi (2019b): "Kernel Estimation for Panel Data with Heterogeneous Dynamics", Forthcoming in The Econometrics Journal.



tkhdyanagi/panelhetero documentation built on Dec. 29, 2019, 4:40 a.m.