This vignette explains how to use the package **panelhetero** in **R**.

The package implements model-free procedures proposed by Okui and Yanagi (2019a, 2019b) to examine the degree of heterogeneity across cross-sectional units based on panel data.

Let ${ { y_{it} }*{t=1}^T }*{i=1}^N$ be balanced panel data where $y_{it}$ is a scalar random variable, $N$ is the number of cross-sectional units, and $T$ is the length of time series.
We assume that the individual time series ${ y_{it} }_{t=1}^T$ is strictly stationary across time but heterogeneous acorss units.

To examine the properties of heterogeneity, we focus on estimating the cumulative distribution function (CDF), density, and moments (i.e., the means, variances, and correlations) of the heterogeneous mean, $k$-th autocovariance, and $k$-th autocorrelation:

$\mu_i = E(y_{it}|i)$, $\gamma_{k,i} = E[(y_{it} - \mu_i)(y_{i,t-k} - \mu_i)|i]$, and $\rho_{k,i} = \gamma_{k,i}/\gamma_{0,i}$,

where $E(\cdot|i)$ indicates the population mean for the individual time series. In words, $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ are the population mean, $k$-th autocovariance, and $k$-th autocorrelation for the individual time series. Note that $\gamma_{0, i}$ is the variance. They are random variables as discussed in Okui and Yanagi (2019a, 2019b). We assume that they are i.i.d. across units.

The estimators of $\mu_i$, $\gamma_{k,i}$, $\rho_{k,i}$ are given by the sample analogues:

$\hat \mu_i = \frac{1}{T} \sum_{t=1}^T y_{it}$, $\hat \gamma_{k,i} = \frac{1}{T-k} \sum_{t=k+1}^T (y_{it} - \hat \mu_i)(y_{i,t-k} - \hat \mu_i)$, and $\hat \rho_{k,i} = \hat \gamma_{k,i} / \hat \gamma_{0,i}$.

For $\xi_i = \mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$, the parameters of interest are the CDF $F_{\xi}(\cdot) = \Pr(\xi_i \le \cdot)$, the density $f_{\xi}(\cdot)$, the mean $E(\xi_i)$, the variance $var(\xi_i)$, and the correlations such as $cor(\mu_i, \gamma_{k,i})$.

Okui and Yanagi (2019a) develop empirical distribution estimation of the CDF and the moments of the true $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ based on the estimated $\hat \mu_i$, $\hat \gamma_{k,i}$, and $\hat \rho_{k,i}$. Okui and Yanagi (2019b) propose nonparametric kernel smoothing estimation for the density and the CDF of the true $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$ based on the estimated $\hat \mu_i$, $\hat \gamma_{k,i}$, and $\hat \rho_{k,i}$. Further, the papers develop half-panel jaccknife (HPJ) and third-order jackknife (TOJ) bias-correction methods to eliminate asymptotic biases of the estimators and cross-sectional boostrap procedures for statistical inferences. Note that HPJ and TOJ require that $T \ge 4$ and that $T \ge 6$, respectively. See the papers for more details.

**panelhetero** implements these procedures.

**panelhetero** contains the following functions.

`nemoment`

: naive estimation for the means, variances, and correlations without bias-correction`hpjmoment`

: HPJ bias-corrected estimation for the means, variances, and correlations`tojmoment`

: TOJ bias-corrected estimation for the means, variances, and correlations`neecdf`

: naive estimation for the empirical CDFs without bias-correction`hpjecdf`

: HPJ bias-corrected estimation for the empirical CDFs`tojecdf`

: TOJ bias-corrected estimation for the empirical CDFs`nekd`

: naive estimation for the kernel density without bias-correction`hpjkd`

: HPJ bias-corrected estimation for the kernel density`tojkd`

: TOJ bias-corrected estimation for the kernel density`nekcdf`

: naive estimation for the kernel CDF without bias-correction`hpjkcdf`

: HPJ bias-corrected estimation for the kernel CDF`tojkcdf`

: TOJ bias-corrected estimation for the kernel CDF

In the following, we explain the arguments and the returned values of each function.

`nemoment`

`nemoment`

implements the naive estimation for the moments without bias-correction.
The parameters are the means, variances, and correlations:

$E(\mu_i)$, $E(\gamma_{k,i})$, $E(\rho_{k,i})$, $var(\mu_i)$, $var(\gamma_{k,i})$, $var(\rho_{k,i})$, $cor(\mu_i, \gamma_{k,i})$, $cor(\mu_i, \rho_{k,i})$, and $cor(\gamma_{k,i}, \rho_{k,i})$.

The usage is `nemoment(data, acov_order = 0, acor_order = 1, R = 1000)`

where the arguments are as follows.

`data`

: $N \times T$ matrix for panel data in which each row is individual time series.`acov_order`

: non-negative integer $k$ for the order of the autocovariance. The default is`acov_order = 0`

.`acor_order`

: positive integer $k$ for the order of the autocorrelation. The default is`acor_order = 1`

.`R`

: positive integer for the number of bootstrap replications. The default is`R = 1000`

.

The returned value is a list that contains the following elements.

`estimate`

: estimates for the parameters without bias-correction`se`

: standard errors for the estimators based on the cross-sectional bootstrap`ci`

: 95\% confidence intervals for the parameters based on the cross-sectional bootstrap`quantity`

: matrix that contains the estimated means, autocovariances, and autocorrelations`acov_order`

: the same as the argument`acor_order`

: the same as the argument`N`

: the number of cross-sectional units`S`

: the length of time series`R`

: the number of bootstrap replications

Note: The bootstrap procedure depends on random number generation.
For reproducibility, it is highly recommended to set the seed of random number generator via `set.seed`

before implementing `nemoment`

.

`hpjmoment`

`hpjmoment`

implements the HPJ bias-corrected estimation for the moments.
The parameters are the same as those of `nemoment`

.
The usage is `hpjmoment(data, acov_order = 0, acor_order = 1, R = 1000)`

where the arguments are the same as those of `nemoment`

.
The returned value is a list that contains the same elements as `nemoment`

.

`tojmoment`

`tojmoment`

implements the TOJ bias-corrected estimation for the moments.
The parameters are the same as those of `nemoment`

.
The usage is `tojmoment(data, acov_order = 0, acor_order = 1, R = 1000)`

where the arguments are the same as those of `nemoment`

.
The returned value is a list that contains the same elements as `nemoment`

.

`neecdf`

`neecdf`

implements the naive estimation for the empirical CDF without bias-correction.
The parameters are the CDFs of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$.

The usage is `neecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE)`

where the arguments are as follows.

`data`

: $N \times T$ matrix for panel data in which each row is individual time series.`acov_order`

: non-negative integer $k$ for the order of autocovariance. The default is`acov_order = 0`

.`acor_order`

: positive integer $k$ for the order of autocorrelation. The default is`acor_order = 1`

.`R`

: positive integer for the number of bootstrap replications. The default is`R = 1000`

.`ci`

: bool for the confidence interval. The default is TRUE.

The returned value is a list that contains the following elements.

`mean`

: graph of the empirical CDF estimation for the mean generated by`stat_function`

in**ggplot2**`acov`

: graph of the empirical CDF estimation for the autocovariance generated by`stat_function`

in**ggplot2**`acor`

: graph of the empirical CDF estimation for the autocorrelation generated by`stat_function`

in**ggplot2**`mean_func`

: function that returns naive empirical CDF estimates for the mean`acov_func`

: function that returns naive empirical CDF estimates for the autocovariance`acor_func`

: function that returns naive empirical CDF estimates for the autocorrelation`mean_ci_func`

: function that returns 95\% bootstrap confidence interval for naive empirical CDF estimates for the mean`acov_ci_func`

: function that returns 95\% bootstrap confidence interval for naive empirical CDF estimates for the autocovariance`acor_ci_func`

: function that returns 95\% bootstrap confidence interval for naive empirical CDF estimates for the autocorrelation`quantity`

: matrix that contains the estimated means, autocovariances, and autocorrelations`acov_order`

: the same as the argument`acor_order`

: the same as the argument`N`

: the number of cross-sectional units`S`

: the length of time series`R`

: the number of bootstrap replications

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

`hpjecdf`

`hpjecdf`

implements the HPJ bias-corrected estimation for the empirical CDF.
The parameters are the same as those of `neecdf`

.
The usage is `hpjecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE)`

where the arguments are the same as those of `neecdf`

.
The returned value is a list that contains the same elements as `neecdf`

.

Note: Since the HPJ and TOJ bias-corrected CDF estimators may be a non-monotonic function as a consequence of the bias-correction, we convert these bias-corrected estimators into monotonic functions based on **Rearrangement**.

Note: The functions `mean_func`

, `acov_func`

, and `acor_func`

in `hpjecdf`

and `tojecdf`

are not subject to rearrangement, since **Rearrangement** cannot return functions.
Nontheless, you can manually use **Rearrangement** to obtain rearranged estimates.

Note: Since the HPJ and TOJ bias-corrected CDF estimates may be smaller than 0 or greater than 1 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0 and no larger than 1.

`tojecdf`

`tojecdf`

implements the TOJ bias-corrected estimation for the empirical CDF.
The parameters are the same as those of `neecdf`

.
The usage is `tojecdf(data, acov_order = 0, acor_order = 1, R = 1000, ci = TRUE)`

where the arguments are the same as those of `neecdf`

.
The returned value is a list that contains the same elements as `neecdf`

.

`nekd`

`nekd`

implements the naive estimation for the kernel density without bias-correction.
The parameters are the densities of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$.
The function uses the Gaussian kernel for the kernel density estimation.

The usage is `nekd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are as follows.

`data`

: $N \times T$ matrix for panel data in which each row is individual time series.`acov_order`

: non-negative integer $k$ for the order of autocovariance. The default is`acov_order = 0`

.`acor_order`

: positive integer $k$ for the order of autocorrelation. The default is`acor_order = 1`

.`mean_bw`

: bandwidth for the density estimation of the mean. The default is selected as the plug-in bandwidth by`dpik`

in**KernSmooth**.`acov_bw`

: bandwidth for the density estimation of the autocovariance. The default is selected as the plug-in bandwidth by`dpik`

in**KernSmooth**.`acor_bw`

: bandwidth for the density estimation of the autocorrelation. The default is selected as the plug-in bandwidth by`dpik`

in**KernSmooth**.

The returned value is a list that contains the following elements.

`mean`

: graph of the naive kernel density estimation for the mean generated by`stat_function`

in**ggplot2**`acov`

: graph of the naive kernel density estimation for the autocovariance generated by`stat_function`

in**ggplot2**`acor`

: graph of the naive kernel density estimation for the autocorrelation generated by`stat_function`

in**ggplot2**`mean_func`

: function that returns naive kernel density estimates for the mean`acov_func`

: function that returns naive kernel density estimates for the autocovariance`acor_func`

: function that returns naive kernel density estimates for the autocorrelation`bandwidth`

: the selected bandwidths`quantity`

: matrix that contains the estimated means, autocovariances, and autocorrelations`acov_order`

: the same as the argument`acor_order`

: the same as the argument`N`

: the number of cross-sectional units`S`

: the length of time series

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

`hpjkd`

`hpjkd`

implements the HPJ bias-corrected estimation for the kernel density.
The parameters are the same as those of `nekd`

.
The usage is `hpjkd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are the same as those of `nekd`

.
The returned value is a list that contains the same elements as `nekd`

.

Note: Since the HPJ and TOJ bias-corrected density estimates may be smaller than 0 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0.

`tojkd`

`tojkd`

implements the TOJ bias-corrected estimation for the kernel density.
The parameters are the same as those of `nekd`

.
The usage is `tojkd(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are the same as those of `nekd`

.
The returned value is a list that contains the same elements as `nekd`

.

`nekcdf`

`nekcdf`

implements the naive estimation for the kernel CDF without bias-correction.
The parameters are the CDFs of $\mu_i$, $\gamma_{k,i}$, and $\rho_{k,i}$.
The function uses the Gaussian CDF kernel function for the kernel CDF estimation.

The usage is `nekcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are as follows.

`data`

: $N \times T$ matrix for panel data in which each row is individual time series.`acov_order`

: non-negative integer $k$ for the order of autocovariance. The default is`acov_order = 0`

.`acor_order`

: positive integer $k$ for the order of autocorrelation. The default is`acor_order = 1`

.`mean_bw`

: bandwidth for the CDF estimation of the mean. The default is selected as the plug-in bandwidth by`PBbw`

in**kerdiest**.`acov_bw`

: bandwidth for the CDF estimation of the autocovariance. The default is selected as the plug-in bandwidth by`PBbw`

in**kerdiest**.`acor_bw`

: bandwidth for the CDF estimation of the autocorrelation. The default is selected as the plug-in bandwidth by`PBbw`

in**kerdiest**.

The returned value is a list that contains the following elements.

`mean`

: graph of the naive kernel CDF estimation for the mean generated by`stat_function`

in**ggplot2**.`acov`

: graph of the naive kernel CDF estimation for the autocovariance generated by`stat_function`

in**ggplot2**`acor`

: graph of the naive kernel CDF estimation for the autocorrelation generated by`stat_function`

in**ggplot2**`mean_func`

: function that returns naive kernel CDF estimates for the mean`acov_func`

: function that returns naive kernel CDF estimates for the autocovariance`acor_func`

: function that returns naive kernel CDF estimates for the autocorrelation`bandwidth`

: the selected bandwidths`quantity`

: matrix that contains the estimated means, autocovariances, and autocorrelations`acov_order`

: the same as the argument`acor_order`

: the same as the argument`N`

: the number of cross-sectional units`S`

: the length of time series

Note: In each graph, x-axis limits are set as the minium and maximum of the estimated quantities (e.g. the estimated means).

`hpjkcdf`

`hpjkcdf`

implements the HPJ bias-corrected estimation for the kernel CDF.
The parameters are the same as those of `nekcdf`

.
The usage is `hpjkcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are the same as those of `nekcdf`

.
The returned value is a list that contains the same elements as `nekcdf`

.

Note: Since the HPJ and TOJ bias-corrected CDF estimators may be non-monotonic functions as a consequence of the bias-correction, we convert these bias-corrected estimators into monotonic functions based on **Rearrangement**.

Note: The functions `mean_func`

, `acov_func`

, and `acor_func`

in `hpjkcdf`

and `tojkcdf`

are not subject to rearrangement, since **Rearrangement** cannot return functions.
Nontheless, you can manually use **Rearrangement** to obtain rearranged estimates.

Note: Since the HPJ and TOJ bias-corrected CDF estimates may be smaller than 0 or greater than 1 as a consequence of the bias-correction, we impose the restriction that the estimates are no less than 0 and no larger than 1.

`tojkcdf`

`tojkcdf`

implements the TOJ bias-corrected estimation for the kernel CDF.
The parameters are the same as those of `nekcdf`

.
The usage is `tojkcdf(data, acov_order = 0, acor_order = 1, mean_bw = NULL, acov_bw = NULL, acor_bw = NULL)`

where the arguments are the same as those of `nekcdf`

.
The returned value is a list that contains the same elements as `nekcdf`

.

We illustrate the use of **panelhetero** through the following simple example.
We generate simulated data based on AR(1) model with heterogeneous random coefficients.

set.seed(101) # sample sizes N <- 300 S <- 50 # heterogeneous random coefficients c <- rnorm(N, mean = 0, sd = 1) phi <- rbeta(N, shape1 = 2, shape2 = 5) sigma <- rbeta(N, shape1 = 2, shape2 = 5) + 0.5 # initializing N * (T + 1) data matrix including intial values y <- matrix(0, nrow = N, ncol = S + 1) # generated by AR(1) model for(i in 1:N) { # unobserved initial value y[i,1] <- rnorm(1, mean = c[i], sd = sigma[i]) for(t in 2:(S+1)) { y[i, t] <- (1 - phi[i]) * c[i] + phi[i] * y[i, t-1] + sqrt((1 - phi[i])^2 * sigma[i]^2) * rnorm(1, mean = 0, sd = 1) } } # N * T data matrix y <- y[, -1]

We can estimate the moments of the heterogeneous mean, autocovariance, and autocorrelation via `nemoment`

, `hpjmoment`

, and `tojmoment`

.
For example, we can implement the HPJ bias-corrected estimation for the moments as follows.

library("panelhetero") set.seed(101) r1 <- hpjmoment(data = y, acov_order = 0, acor_order = 1) r1$estimate r1$se r1$ci

We can also implement the empirical and kernel CDF estimation and the kernel density estimation via `neecdf`

, `hpjecdf`

, `tojecdf`

, `nekd`

, `hpjkd`

, `tojkd`

, `nekcdf`

, `hpjkcdf`

, and `tojkcdf`

.
For example, we can estimate the density of $\mu_i$ based on the HPJ bias-corrected kernel density estimation as follows.

r2 <- hpjkd(data = y, acov_order = 0, acor_order = 1) r2$mean

The graph is plotted via **ggplot2**, so that we can customize it through standard commands for **ggplot2** (e.g., customizing the title, background, axes, and labels).
For example, we can customize the title and the theme as follows.

library(ggplot2) r2$mean + ggtitle("The density of the mean") + theme_classic()

Also, `neecdf`

, `hpjecdf`

, `tojecdf`

, `nekd`

, `hpjkd`

, `tojkd`

, `nekcdf`

, `hpjkcdf`

, and `tojkcdf`

return the functions `mean_func`

, `acov_func`

, and `acor_func`

that return values of the estimated functions.
By using these functions, you can plot graphs on your own.
For example, the graph of the HPJ bias-corrected density for the mean can be plotted as follows.

ggplot(data = data.frame(x = c(-3.5, 3.5)), aes(x = x)) + stat_function(fun = r2$mean_func) + ggtitle("The density of the mean")

This usage is useful especially when you would like to plot a graph with a specified range or when you would like to overlay several graphs.

ggplot2, kerdiest, KernSmooth, Rearrangement,

Ryo Okui, Takahide Yanagi (2019a): "Panel Data Analysis with Heterogeneous Dynamics", Journal of Econometrics, 212(2), 451-475.

Ryo Okui, Takahide Yanagi (2019b): "Kernel Estimation for Panel Data with Heterogeneous Dynamics", Forthcoming in The Econometrics Journal.

tkhdyanagi/panelhetero documentation built on Dec. 29, 2019, 4:40 a.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.