pdata.frame: data.frame for panel data
In plm: Linear Models for Panel Data


An object of class 'pdata.frame' is a data.frame with an index attribute that describes its individual and time dimensions.

pdata.frame(
  x,
  index = NULL,
  drop.index = FALSE,
  row.names = TRUE,
  stringsAsFactors = default.stringsAsFactors(),
  replace.non.finite = FALSE,
  drop.NA.series = FALSE,
  drop.const.series = FALSE,
  drop.unused.levels = FALSE
)

## S3 replacement method for class 'pdata.frame'
x$name <- value

## S3 method for class 'pdata.frame'
x[i, j, drop]

## S3 method for class 'pdata.frame'
x[[y]]

## S3 method for class 'pdata.frame'
x$y

## S3 method for class 'pdata.frame'
print(x, ...)

## S3 method for class 'pdata.frame'
as.list(x, keep.attributes = FALSE, ...)

## S3 method for class 'pdata.frame'
as.data.frame(
  x,
  row.names = NULL,
  optional = FALSE,
  keep.attributes = TRUE,
  ...
)

`x`	a `data.frame` for the `pdata.frame` function and a `pdata.frame` for the methods,
`index`	this argument indicates the individual and time indexes. See Details,
`drop.index`	logical, indicates whether the indexes are to be excluded from the resulting pdata.frame,
`row.names`	`NULL` or logical, indicates whether "fancy" row names (combination of individual index and time index) are to be added to the returned (p)data.frame (`NULL` and `FALSE` have the same meaning for `pdata.frame`; for `as.data.frame.pdata.frame` see Details),
`stringsAsFactors`	logical, indicating whether character vectors are to be converted to factors,
`replace.non.finite`	logical, indicating whether non-finite values, i.e., values for which `is.finite()` yields `FALSE`, are to be replaced by `NA` values, except for character variables (defaults to `FALSE`),
`drop.NA.series`	logical, indicating whether all-NA columns are to be removed from the pdata.frame (defaults to `FALSE`),
`drop.const.series`	logical, indicating whether constant columns are to be removed from the pdata.frame (defaults to `FALSE`),
`drop.unused.levels`	logical, indicating whether unused levels of factors are to be dropped (defaults to `FALSE`) (unused levels are always dropped from variables serving to construct the index variables),
`name`	the name of the variable (column) to be set in the `pdata.frame`,
`value`	the value to assign to that variable,
`i`	see `Extract()`,
`j`	see `Extract()`,
`drop`	see `Extract()`,
`y`	one of the columns of the `data.frame`,
`...`	further arguments,
`keep.attributes`	logical, only for the `as.list` and `as.data.frame` methods, indicating whether the elements of the returned list/columns of the data.frame should have the pdata.frame's attributes added (default: `FALSE` for `as.list`, `TRUE` for `as.data.frame`),
`optional`	see `as.data.frame()`.
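
As an illustration of the cleaning arguments, the following sketch builds a small made-up data frame (the names `df`, `id`, `t`, `y`, and `const` are hypothetical and not part of plm) and passes it through `pdata.frame` with `replace.non.finite` and `drop.const.series` switched on. It is a minimal example under these assumptions, not a recommended workflow.

```r
library(plm)

# Hypothetical toy data: 2 individuals, 2 periods each.
# 'y' contains non-finite values, 'const' never varies.
df <- data.frame(
  id    = rep(1:2, each = 2),
  t     = rep(1:2, times = 2),
  y     = c(1, Inf, 3, NaN),
  const = 5
)

p <- pdata.frame(
  df,
  index              = c("id", "t"),
  replace.non.finite = TRUE,  # Inf and NaN in 'y' become NA
  drop.const.series  = TRUE   # the constant column 'const' is removed
)

names(p)             # 'const' should no longer be listed
as.numeric(p[["y"]]) # the non-finite entries should now be NA
```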

The index argument indicates the dimensions of the panel. It can be:

a vector of two character strings which contains the names of the individual and of the time indexes,
a character string which is the name of the individual index variable. In this case, the time index is created automatically and a new variable called "time" is added, assuming consecutive and ascending time periods in the order of the original data,
an integer, the number of individuals. In this case, the data need to be a balanced panel and be organized as a stacked time series (successive blocks of individuals, each block being a time series for the respective individual) assuming consecutive and ascending time periods in the order of the original data. Two new variables are added: "id" and "time" which contain the individual and the time indexes.
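
The three forms of the index argument can be sketched on a small made-up balanced panel. The data frame `df` and its columns `firm`, `year`, and `y` are hypothetical and only serve the illustration; the data are already stacked individual by individual, as the integer form requires.

```r
library(plm)

# Hypothetical toy data: 2 individuals observed over 3 periods,
# stacked individual by individual.
df <- data.frame(
  firm = rep(c("A", "B"), each = 3),
  year = rep(2001:2003, times = 2),
  y    = c(1.0, 1.5, 2.0, 3.0, 2.5, 4.0)
)

# (1) Names of the individual and of the time index.
p1 <- pdata.frame(df, index = c("firm", "year"))

# (2) Name of the individual index only; a "time" variable is created.
p2 <- pdata.frame(df[, c("firm", "y")], index = "firm")

# (3) Number of individuals; "id" and "time" variables are created.
p3 <- pdata.frame(df["y"], index = 2)

pdim(p1)   # panel dimensions of the first variant
names(p2)  # includes the generated "time" variable
names(p3)  # includes the generated "id" and "time" variables
```

Form (1) is the most explicit and makes no assumption about row order; forms (2) and (3) rely on the rows being in consecutive, ascending time order.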

The "[[" and "$" methods extract a series from the pdata.frame. The extracted series receives the "index" attribute and the additional class "pseries". The "[" method behaves as for a data.frame, except that the extraction is also applied to the index attribute. A safe way to extract the index attribute is to use the function index() for 'pdata.frames' (and other objects).
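
The extraction behaviour can be checked on a small made-up panel; the names `df`, `firm`, `year`, and `y` are hypothetical. The sketch extracts one series, inspects its class and index, and then subsets rows to show that the index is subset along with the data.

```r
library(plm)

# Hypothetical toy data: 2 individuals observed over 3 periods.
df <- data.frame(
  firm = rep(c("A", "B"), each = 3),
  year = rep(2001:2003, times = 2),
  y    = c(1.0, 1.5, 2.0, 3.0, 2.5, 4.0)
)
p <- pdata.frame(df, index = c("firm", "year"))

# "$" (like "[[") returns a pseries that carries the index.
s <- p$y
inherits(s, "pseries")
index(s)

# "[" subsets the data and the index attribute together.
sub <- p[1:3, ]
nrow(sub)
nrow(index(sub))
```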

as.data.frame removes the index attribute from the pdata.frame and, with the default keep.attributes = TRUE, adds it to each column. If its argument row.names is set to FALSE, the row names are an integer series; TRUE gives "fancy" row names; if it is a character vector (with length equal to the number of rows of the resulting data frame), its elements are used as the row names.
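
A short sketch of the row.names options of as.data.frame, again on made-up data (the names `df`, `firm`, `year`, and `y` are hypothetical):

```r
library(plm)

# Hypothetical toy data: 2 individuals observed over 3 periods.
df <- data.frame(
  firm = rep(c("A", "B"), each = 3),
  year = rep(2001:2003, times = 2),
  y    = c(1.0, 1.5, 2.0, 3.0, 2.5, 4.0)
)
p <- pdata.frame(df, index = c("firm", "year"))

# Integer row names.
d_plain <- as.data.frame(p, row.names = FALSE)
rownames(d_plain)

# "Fancy" row names built from the individual and time index.
d_fancy <- as.data.frame(p, row.names = TRUE)
rownames(d_fancy)

# User-supplied row names, one per row.
d_custom <- as.data.frame(p, row.names = paste0("obs", seq_len(nrow(p))))
rownames(d_custom)
```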

as.list behaves by default identically to base::as.list.data.frame(), which means it drops the attributes specific to a pdata.frame; if a list of pseries is wanted, the argument keep.attributes can be set to TRUE. This also makes lapply work as expected on a pdata.frame (see also Examples).

A pdata.frame object: a data.frame with an index attribute, which is itself a data.frame with two variables, the individual and the time indexes, both being factors. The resulting pdata.frame is sorted by the individual index, then by the time index.

Yves Croissant

index() to extract the index variables from a 'pdata.frame' (and other objects), pdim() to check the dimensions of a 'pdata.frame' (and other objects), pvar() to check for each variable if it varies cross-sectionally and over time. To check if the time periods are consecutive per individual, see is.pconsecutive().

# Gasoline contains two variables which are individual and time
# indexes
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)

# Hedonic is an unbalanced panel, townid is the individual index
data("Hedonic", package = "plm")
Hed <- pdata.frame(Hedonic, index = "townid", row.names = FALSE)

# In case of a balanced panel, it is sufficient to give the number of
# individuals; the data set 'Wages' is organized as a stacked time
# series
data("Wages", package = "plm")
Wag <- pdata.frame(Wages, 595)

# lapply on a pdata.frame by making it a list of pseries first
lapply(as.list(Wag[ , c("ed", "lwage")], keep.attributes = TRUE), lag)