knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Most of the functions in this package require a path to a config.yml
file as input. This structure allows for an easily reviewed file that contains all relevant parameters that should be / were used for a given multi-loanbook analysis. Below is a full documentation of each option.
The config file is separated into a few top-level "sections" that contain contextually similar options. The top-level sections will be documented as well below, but note that the top-level sections themselves never have a value directly associated with them.
Also note that the config file must have the top-level section default
. This is related to a feature of the yaml
package which facilitates having and targeting different config sets for different purposes. Technically, one could leverage this for use with pacta.multi.loanbook, but it is not recommended.
The directories
{.yaml} section contains options to define locally accessible paths where input and output data should be found or saved. A full example directories
{.yaml} section might look like:
directories: dir_input: "~/Desktop/test/input" dir_prepared_abcd: "~/Desktop/test/prepared_abcd" dir_matched_loanbooks: "~/Desktop/test/matched_loanbooks" dir_prioritized_loanbooks_and_diagnostics: "~/Desktop/test/prioritized_loanbooks_and_diagnostics" dir_analysis: "~/Desktop/test/analysis"
dir_input
{.yaml} is a path to a directory that contains all input data to be used. Input data is any data set that must be produced or obtained externally by the user and that is not the output of any of the functions in this package. This includes files only needed optionally. It must be a single string/character value, and it must refer to a valid, accessible, local directory. As an example:
dir_input: "~/Desktop/test/input"
dir_prepared_abcd
{.yaml} is a path to a directory where the outputs of the function prepare_abcd()
should be saved. It must be a single string/character value, and it must refer to a valid, accessible, local directory. As an example:
dir_prepared_abcd: "~/Desktop/test/prepared_abcd"
dir_matched_loanbooks
{.yaml} is a path to a directory where the outputs of the function match_loanbooks()
should be saved. It must be a single string/character value, and it must refer to a valid, accessible, local directory. As an example:
dir_matched_loanbooks: "~/Desktop/test/matched_loanbooks"
dir_prioritized_loanbooks_and_diagnostics
{.yaml} is a path to a directory where the outputs of the function prioritise_and_diagnose()
should be saved. It must be a single string/character value, and it must refer to a valid, accessible, local directory. As an example:
dir_prioritized_loanbooks_and_diagnostics: "~/Desktop/test/prioritized_loanbooks_and_diagnostics"
dir_analysis
{.yaml} is a path to a directory where the outputs of the function analyse()
should be saved. It must be a single string/character value, and it must refer to a valid, accessible, local directory. As an example:
dir_analysis: "~/Desktop/test/analysis"
The file_names
{.yaml} section contains options to define the file names of locally accessible files found in the directories defined in the directories
{.yaml} section. The directories and file names are defined separately to allow for flexibility in where and how your input and output files are stored. A full example file_names
{.yaml} section might look like:
file_names: filename_scenario_tms: "scenarios_2022_p4b.csv" filename_scenario_sda: "scenarios_2022_ei_p4b.csv" filename_abcd: "2023-02-17_AI_RMI_PACTA for Banks Free dataset_EO_2022Q4.xlsx" sheet_abcd: "Company Indicators - PACTA Comp"
filename_scenario_tms
{.yaml} is the filename of the file that contains production based scenario data. The file specified by filename_scenario_tms
{.yaml} must exist in the directory specified by the dir_input
{.yaml} parameter. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_scenario_tms: "scenarios_2022_p4b.csv
filename_scenario_sda
{.yaml} is the filename of the file that contains emission intensity based scenario data. The file specified by filename_scenario_sda
{.yaml} must exist in the directory specified by the dir_input
{.yaml} parameter. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_scenario_sda: "scenarios_2022_ei_p4b.csv"
filename_abcd
{.yaml} is the filename of the file in the directory defined by dir_input
{.yaml} that contains asset based company data, including production values and physical emission intensity values. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_abcd: "2023-02-17_AI_RMI_PACTA for Banks Free dataset_EO_2022Q4.xlsx"
sheet_abcd
{.yaml} is the name of the sheet that contains asset based company data in the file defined by filename_abcd
{.yaml} and stored in the directory defined by dir_input
{.yaml}. It must be a single string/character value, and it must refer to a valid, accessible, sheet name in the appropriate file. As an example:
sheet_abcd: "Company Indicators - PACTA Comp"
A full example project_parameters
{.yaml} section might look like:
project_parameters: scenario_source: "weo_2022" scenario_select: "nze_2050" region_select: "global" start_year: 2022 time_frame: 5 by_group: "group_id"
scenario_source
{.yaml} is an identifier of the scenario source to be used. It must be a single string/character value, and it must refer to a valid, accessible, scenario source identifier contained in the scenario data file/s defined by filename_scenario_tms
{.yaml} and filename_scenario_sda
{.yaml}. Valid values typically look like "weo_2023"
{.yaml} or "geco_2022"
{.yaml}. As an example:
scenario_source: "weo_2022"
scenario_select
{.yaml} is an identifier of the scenario to be used. It must be a single string/character value, and it must refer to a valid, accessible, scenario identifier corresponding to the scenario_source
and contained in the scenario data file/s defined by filename_scenario_tms
{.yaml} and filename_scenario_sda
{.yaml}. Valid values typically look like "nze_2050"
{.yaml}, "aps"
{.yaml} or "steps"
{.yaml}. As an example:
scenario_select: "nze_2050"
region_select
{.yaml} is an identifier of the region to be used. It must be a single string/character value, and it must refer to a valid, accessible, region identifier contained in the r2dii.data::region_isos
{.r} dataset where it must be listed as a region available for the scenario_source
. Valid values typically look like "global"
{.yaml} or "advanced economies"
{.yaml}. As an example:
region_select: "global"
start_year
{.yaml} is the start year of the analysis. Normally, the start year should correspond with year of the publication of the scenario in use. It must be a single numeric value, and it must refer to a valid, accessible, year contained in the scenario data file/s defined by filename_scenario_tms
{.yaml} and filename_scenario_sda
{.yaml}. Valid values typically look like 2022
{.yaml} or 2023
{.yaml} (note that this value should not be wrapped in quotes). As an example:
start_year: 2022
time_frame
{.yaml} is the number of years (starting from the start_year
{.yaml}) that the analysis covers, defining the time frame. It must be a single numeric value, and it must define a valid, accessible, time frame covered by the scenario data file/s defined by filename_scenario_tms
{.yaml} and filename_scenario_sda
{.yaml}. Valid values typically look like 5
{.yaml} or 6
{.yaml} (note that this value should not be wrapped in quotes). As an example:
time_frame: 5
by_group
{.yaml} allows specifying the level of disaggregation to be used in the analysis. It determines the variable along which the loan books are grouped and thus the dimension by which to compare the PACTA calculations. For example, one may want to calculate system-wide results without disaggregation, using NULL
{.yaml} or one may want to analyse alignment along bank specific traits, such as "group_id"
{.yaml} or "bank_type"
{.yaml}. It can be NULL
{.yaml} or a character vector of length 1. If it is not NULL
{.yaml}, the indicated name must be a variable that is provided in the input loan books and it must be complete ("group_id"
is automatically created when reading in the loan books, so the user does not have to add it to the raw loan books). If the provided character string is "NULL"
{.yaml}, it will be treated as NULL
{.yaml}. As an example:
by_group: "group_id"
A full example sector_split
{.yaml} section might look like:
sector_split: apply_sector_split: TRUE filename_split_company_id: "split_company_ids.csv" filename_advanced_company_indicators: "2024-02-14_AI_2023Q4_RMI-Company-Indicators.xlsx" sheet_advanced_company_indicators: "Company Activities"
apply_sector_split
{.yaml} It must be a single logical value (either TRUE
{.yaml} or FALSE
{.yaml}). As an example:
apply_sector_split: TRUE
filename_split_company_id
{.yaml} is the filename of the CSV file that contains the split company ID data. The file specified by filename_split_company_id
{.yaml} must exist in the directory specified by the dir_input
{.yaml} parameter. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_split_company_id: "split_company_ids.csv"
filename_advanced_company_indicators
{.yaml} is the filename of the XLSX file that contains the Advanced Company Indicators. The file specified by filename_advanced_company_indicators
{.yaml} must exist in the directory specified by the dir_input
{.yaml} parameter. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_advanced_company_indicators: "2024-02-14_AI_2023Q4_RMI-Company-Indicators.xlsx"
sheet_advanced_company_indicators
{.yaml} is the name of the sheet that contains asset based company production data in the file defined by filename_advanced_company_indicators
{.yaml} and stored in the directory defined by dir_input
{.yaml}. It must be a single string/character value, and it must refer to a valid, accessible, sheet name in the appropriate file. As an example:
sheet_advanced_company_indicators: "Company Activities"
A full example matching
{.yaml} section might look like:
matching: params_match_name: by_sector: TRUE min_score: 0.9 method: "jw" p: 0.1 overwrite: NULL join_id: NULL manual_sector_classification: use_manual_sector_classification: FALSE filename_manual_sector_classification: "manual_sector_classification.csv"
A full example params_match_name
{.yaml} section might look like:
params_match_name: by_sector: TRUE min_score: 0.9 method: "jw" p: 0.1 overwrite: NULL join_id: NULL
by_sector
{.yaml}. It must be a single logical value (either TRUE
{.yaml} or FALSE
{.yaml}). Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R}. As an example:
by_sector: TRUE
min_score
{.yaml} is a number between 0-1, to set the minimum score threshold. A score of 1 is a perfect match. It must be a single numeric value. Valid values typically look like 0.7
{.yaml} or 0.9
{.yaml} (note that this value should not be wrapped in quotes). Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R}. As an example:
min_score: 0.9
method
{.yaml} is the method for distance calculation. It must be a single string/character value, and it must refer to a valid method identifier, one of "osa"
{.yaml}, "lv"
{.yaml}, "dl"
{.yaml}, "hamming"
{.yaml}, "lcs"
{.yaml}, "qgram"
{.yaml}, "cosine"
{.yaml}, "jaccard"
{.yaml}, "jw"
{.yaml}, "soundex"
{.yaml}. Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R} and stringdist::stringdist-metrics
{.R}. As an example:
method: "jw"
p
{.yaml} is the prefix factor for Jaro-Winkler distance. The valid range for p is 0 <= p <= 0.25. If p=0 (default), the Jaro-distance is returned. Applies only to method='jw'. It must be a single numeric value. Valid values typically look like 0.1
{.yaml} or 0.2
{.yaml} (note that this value should not be wrapped in quotes). Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R}. As an example:
p: 0.1
overwrite
{.yaml}. Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R}. As an example:
overwrite: NULL
join_id
{.yaml} is an optional parameter that allows defining by which variable to match the loans to the the companies in the abcd
. Its intended use case is join based on unambiguous identifiers, such as the lei
, where such data is available. It can be NULL
{.yaml} to use standard name matching when no common identifiers are given. Must be a join specification which is internally passed to dplyr::inner_join
. If it is an unnamed character/string vector, the values are assumed to refer to identically named join columns. If it is a named character vector, the names are used as the join columns in the loanbook
and the values are used as the join columns in the abcd
. Further explanation of this argument can be found in the documentation for r2dii.match::match_name()
{.R}. As an example:
join_id: c(lei_direct_loantaker = "lei")
A full example manual_sector_classification
{.yaml} section might look like:
manual_sector_classification: use_manual_sector_classification: FALSE filename_manual_sector_classification: "manual_sector_classification.csv"
use_manual_sector_classification
{.yaml} determines if the matching should use an internally provided sector classification system or if it should use one provided by the user instead. Internal sector classification systems are given in r2dii.data::sector_classifications
- see also additional documentation in r2dii.data
. The function will automatically attempt to use one of the sector classification systems, based on the inputs in the raw loan book files. If an externally prepared sector classification system is to be used, for example because the loans are classified using a system that is not provided in r2dii.data out of the box, the data must be prepared following the same structure as found in r2dii.data::sector_classifications
. It must be a single logical value (either TRUE
{.yaml} or FALSE
{.yaml}). As an example:
use_manual_sector_classification: FALSE
filename_manual_sector_classification
{.yaml} is the filename of the CSV that contains the manual sector classification data. The file specified by filename_manual_sector_classification
{.yaml} must exist in the directory specified by the dir_input
{.yaml} parameter. It must be a single string/character value, and it must refer to a valid, accessible, local file. As an example:
filename_manual_sector_classification: "manual_sector_classification.csv"
A full example match_prioritize
{.yaml} section might look like:
match_prioritize: priority: NULL
priority
{.yaml} indicates the level of matching that should be prioritized when a loan can be matched at multiple levels. It must be a single string/character value or NULL
{.yaml}, and it must refer to a valid, accessible, local file. Further explanation of this argument can be found in the documentation for r2dii.match::priortize()
{.R}. As an example:
priority: NULL
A full example prepare_abcd
{.yaml} section might look like:
prepare_abcd: remove_inactive_companies: TRUE
remove_inactive_companies
{.yaml} determines if inactive companies should be removed from the abcd dataset or not. "Companies" here refers to company-sector combinations and "inactive" characterizes such company-sector combinations that are inactive at the end of the time frame analysed. When focusing forward looking analysis on exposures in the end year, such inactive companies may not produce meaningful results. It must be a single logical value (either TRUE
{.yaml} or FALSE
{.yaml}). As an example:
remove_inactive_companies: TRUE
A full example match_success_rate
{.yaml} section might look like:
match_success_rate: plot_width: 12 plot_height: 8 plot_units: "in" plot_resolution: 300
plot_width
{.yaml} is the desired width of the XXX output plot in units defined by plot_units
{.yaml}. It must be a single numeric value. Valid values typically look like 10
{.yaml} or 12
{.yaml} (note that this value should not be wrapped in quotes). As an example:
plot_width: 12
plot_height
{.yaml} is the desired height of the XXX output plot in units defined by plot_units
{.yaml}. It must be a single numeric value. Valid values typically look like 6
{.yaml} or 8
{.yaml} (note that this value should not be wrapped in quotes). As an example:
plot_height: 8
plot_units
{.yaml} is the desired units to express the dimensions of the XXX output plot in plot_width
{.yaml} and plot_height
{.yaml}. It must be a single string/character value, and it must refer to a valid unit identifier. Valid values typically look like "in"
{.yaml} or "px"
{.yaml}. As an example:
plot_units: "in"
plot_resolution
{.yaml} is the desired resolution of the XXX output plot in dpi. It must be a single numeric value. Valid values typically look like 72
{.yaml} or 300
{.yaml} (note that this value should not be wrapped in quotes). As an example:
plot_resolution: 300
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.