iea_file_OK: Perform quality assurance on a raw IEA data file

View source: R/initialize.R

iea_file_OK    R Documentation

Perform quality assurance on a raw IEA data file

Description

When starting to work with an IEA data file, it is important to verify its integrity. This function performs some validation tests on .iea_file.

Usage

iea_file_OK(
  .iea_file = NULL,
  text = NULL,
  expected_1st_line_start = ",,TIME",
  expected_2nd_line_start = "COUNTRY,FLOW,PRODUCT",
  expected_simple_start = expected_2nd_line_start,
  .slurped_iea_df = NULL,
  country = "COUNTRY",
  flow = "FLOW",
  product = "PRODUCT",
  rowid = "rowid"
)

Arguments

.iea_file

the path to the raw IEA data file for which quality assurance is desired

text

a string containing text to be parsed as an IEA file.

expected_1st_line_start

the expected start of the first line of .iea_file. Default is ",,TIME".

expected_2nd_line_start

the expected start of the second line of .iea_file. Default is "COUNTRY,FLOW,PRODUCT".

expected_simple_start

the expected start of the first line of .iea_file when the file uses the simple format. Default is the value of expected_2nd_line_start. Note that the simple format is sometimes encountered in data supplied by the IEA. It can also be the format of the file when somebody "helpfully" fiddles with the raw data from the IEA.

.slurped_iea_df

a data frame created by slurp_iea_to_raw_df()

country

the name of the country column. Default is "COUNTRY".

flow

the name of the flow column. Default is "FLOW".

product

the name of the product column. Default is "PRODUCT".

rowid

the name of a per-country row number column added internally to the data frame read from .iea_file. Default is "rowid".
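The three header arguments above describe two file layouts: a two-line header, and a simple one-line header. The following is a minimal sketch of how those prefixes could be matched against the first lines of a file. header_format() is a hypothetical helper written for illustration, not a function in IEATools, and the sample lines are made up.

```r
# header_format() is a hypothetical helper, not part of IEATools.
# It classifies the first lines of a file by the prefixes described above.
header_format <- function(lines,
                          expected_1st_line_start = ",,TIME",
                          expected_2nd_line_start = "COUNTRY,FLOW,PRODUCT",
                          expected_simple_start = expected_2nd_line_start) {
  # Two-line header: both the first and the second line match.
  if (length(lines) >= 2 &&
      startsWith(lines[1], expected_1st_line_start) &&
      startsWith(lines[2], expected_2nd_line_start)) {
    return("two-line")
  }
  # Simple header: the first line is already the column header line.
  if (length(lines) >= 1 && startsWith(lines[1], expected_simple_start)) {
    return("simple")
  }
  "unrecognized"
}

# Made-up lines for illustration only.
header_format(c(",,TIME,1971,1972", "COUNTRY,FLOW,PRODUCT,,"))
header_format(c("COUNTRY,FLOW,PRODUCT,1971,1972", "World,Production,Coal,1,2"))
```

The first call returns "two-line" and the second returns "simple". How the package itself handles a file that matches neither layout is not shown here.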

Details

At this time, the only verification step performed by this function is confirming that every country has the same flow and product rows in the same order. The approach is to add a per-country row number column to the data frame and delete all the data in year columns. Then, the resulting data frame is queried for duplicate row numbers. If none are found, the check passes and the function returns TRUE.
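The row-order check described above can be illustrated with a small base R sketch. This is not the package's implementation: same_row_order() is a hypothetical helper, the toy data are invented, and the sketch assumes an ordinary data frame as input.

```r
# Invented toy data: two countries with the same flow/product rows in the same order.
toy <- data.frame(
  COUNTRY = c("A", "A", "B", "B"),
  FLOW    = c("Production", "Imports", "Production", "Imports"),
  PRODUCT = c("Coal", "Coal", "Coal", "Coal"),
  `1971`  = c(1, 2, 3, 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# same_row_order() is a hypothetical helper, not part of IEATools.
same_row_order <- function(df, country = "COUNTRY", flow = "FLOW",
                           product = "PRODUCT", rowid = "rowid") {
  # Keep only the identifying columns; the year data do not matter for this check.
  ids <- as.data.frame(df)[, c(country, flow, product)]
  # Add a row number that restarts at 1 for each country.
  ids[[rowid]] <- ave(seq_len(nrow(ids)), ids[[country]], FUN = seq_along)
  # Drop the country column and de-duplicate. If all countries share one layout,
  # each row number maps to exactly one flow/product pair.
  layout <- unique(ids[, c(rowid, flow, product)])
  !any(duplicated(layout[[rowid]]))
}

same_row_order(toy)                 # TRUE
same_row_order(toy[c(1, 2, 4, 3), ]) # FALSE: country B's rows are swapped
```

This sketch only detects conflicting rows at the same position. A country that is merely missing trailing rows would still pass, so a fuller check would also compare row counts per country.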

Note that .iea_file is read internally with data.table::fread() without stripping white space.

If .slurped_iea_df is supplied, the .iea_file and text arguments are ignored. If .slurped_iea_df is absent, either .iea_file or text is required, and the helper function slurp_iea_to_raw_df() is called internally to load the raw data frame.
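The two entry points described above might be used as follows. This is a sketch that assumes the IEATools package is installed and that slurp_iea_to_raw_df() accepts the file path as its first argument with its other arguments left at their defaults.

```r
library(IEATools)

# Path to the sample IEA data file bundled with the package.
path <- sample_iea_data_path()

# Option 1: supply the path and let iea_file_OK() read the file internally.
ok_from_path <- iea_file_OK(path)

# Option 2: read the file once, then pass the raw data frame.
# Assumption: the path is the first argument of slurp_iea_to_raw_df().
raw_df <- slurp_iea_to_raw_df(path)
ok_from_df <- iea_file_OK(.slurped_iea_df = raw_df)
```

Passing .slurped_iea_df avoids reading the file a second time when the raw data frame is needed for later steps anyway.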

Value

TRUE if .iea_file passes all checks. Errors are thrown when a verification step fails.

Examples

library(IEATools)
library(magrittr)
# Check the sample IEA data file bundled with the package.
sample_iea_data_path() %>%
  iea_file_OK()

MatthewHeun/IEATools documentation built on Feb. 6, 2024, 3:29 p.m.