pp_clean: Tidy a dataset with a "long" policy portfolio structure


pp_clean R Documentation

Tidy a dataset with a "long" policy portfolio structure

Description

Clean a policy portfolio dataset into a tidy object.

Usage

pp_clean(
  d,
  Sector = NULL,
  Country.name = "Country",
  Year.name = "Year",
  Instrument.name = "Instrument",
  Target.name = "Target",
  coding.category.name = "Coding category",
  coding.category = 2,
  Direction.name = "Direction",
  directions = c(0, 1, -1),
  associated.vars = NULL,
  date = FALSE,
  debug = FALSE
)

Arguments

d

Data frame in an uncleaned and untidy structure containing data from a policy portfolio.

Sector

Character vector with the Sector of the dataset.

Country.name

Character vector of length one with the name of the variable that contains the country name.

Year.name

Character vector of length one with the name of the variable that contains the year.

Instrument.name

Character vector of length one with the name of the variable that contains the instruments.

Target.name

Character vector of length one with the name of the variable that contains the targets.

coding.category.name

Character vector of length one with the name of the variable that contains the coding category.

coding.category

Numerical value with the level of the category that captures the combination of instrument and target.

Direction.name

Character vector of length one with the name of the variable that contains the direction of the policy change.

directions

Numerical vector with the values that encode the direction of the policy changes, namely "Status quo", "Expansion" and "Dismantling". Defaults to 0, 1 and -1, respectively.

associated.vars

Character vector indicating variables that contain characteristics of the policy space.

date

Logical value. When FALSE (the default), return Year as the only time indicator. When TRUE, return the full date in dd-mm-YYYY format.

debug

Logical value. When TRUE, print more verbose information about the cleaning process.
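
As a rough illustration of how the naming arguments fit together, the sketch below builds a small data frame by hand whose columns do not use the default names, then maps each column to the corresponding argument. The column names, the values, and the sector label are all invented for illustration, and real raw data may be laid out differently. The call itself is left commented out because it needs the PolicyPortfolios package and an input that pp_clean accepts.

```r
# Hypothetical raw data with non-default column names (all values invented).
raw <- data.frame(
  country    = c("Spain", "Spain", "France"),
  year       = c(2000, 2001, 2000),
  instrument = c("Tax", "Tax", "Subsidy"),
  target     = c("CO2", "CO2", "NOx"),
  category   = c(2, 2, 2),
  direction  = c(1, 0, -1),
  stringsAsFactors = FALSE
)

# Map each naming argument to the column that plays that role.
call_args <- list(
  d                    = raw,
  Sector               = "Environmental",  # invented sector label
  Country.name         = "country",
  Year.name            = "year",
  Instrument.name      = "instrument",
  Target.name          = "target",
  coding.category.name = "category",
  Direction.name       = "direction"
)

# Sanity check before calling: every *.name argument must be an existing column.
name_args <- unlist(call_args[grepl("\\.name$", names(call_args))])
all(name_args %in% names(raw))

## Not run:
# D <- do.call(pp_clean, call_args)
## End(Not run)
```

Checking the names up front is only a convenience: a misspelled column name is then caught before the cleaning step rather than inside it.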

Value

A data frame in a tidy format with the following columns: "Country", "Sector", "Year", "Instrument", "Target" and "covered". "covered" is a binary indicator of whether the portfolio space is covered by policy intervention (1) or not (0). The remaining columns identify the case. Note that "Year" is numeric, while the other four case identifiers are factors.
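
To make the documented structure concrete, the sketch below builds a mock-up of such a data frame by hand in base R. It is not real output from pp_clean; the countries, instruments, targets, and coverage values are invented. It then counts the covered instrument-target combinations per country and year, as one example of what the tidy layout makes easy.

```r
# Hand-built mock-up of the documented output structure (values invented).
D <- expand.grid(
  Country    = c("Spain", "France"),
  Sector     = "Education",
  Year       = c(2000, 2001),
  Instrument = c("Tax", "Subsidy"),
  Target     = c("Schools", "Universities"),
  stringsAsFactors = TRUE
)

# Assumption for illustration: only the "Tax" instrument is in place.
D$covered <- as.integer(D$Instrument == "Tax")

# The four non-Year case identifiers are factors; Year stays numeric.
sapply(D[c("Country", "Sector", "Instrument", "Target")], is.factor)
is.numeric(D$Year)

# Count covered instrument-target combinations per country and year.
size <- aggregate(covered ~ Country + Year, data = D, FUN = sum)
size
```

Because each row is one Country-Sector-Year-Instrument-Target case, summaries like this reduce to a grouped sum over "covered".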

Examples

## Not run: 
# check.names = FALSE keeps column names such as "Coding category" unchanged
X <- read.csv("raw_data.csv", check.names = FALSE)
D <- pp_clean(X, Sector = "Education")

# 'D' is now a tidy dataset suitable for use with the 'PolicyPortfolios' package.

## End(Not run)

PolicyPortfolios documentation built on March 18, 2022, 5:36 p.m.