map_codelist: Maps and replaces one dimension of a fact dataset using a...


View source: R/map_codelist.R

Description

This function maps one dimension (i.e. a column of codes) of a fact dataset to a target code list, using a dataset of mappings between code lists. In other words, it establishes the correspondence between two code lists for a given dimension and, for that dimension, replaces the old codes in the fact dataset with the new codes available in the dataset of mappings.

Usage

map_codelist(df_input, df_mapping, dimension_to_map, keep_src_code)

Arguments

df_input

a data.frame holding the fact dataset

df_mapping

a data.frame holding the mapping between code lists

dimension_to_map

the name (string) of the dimension to map.

keep_src_code

boolean. Keep the source coding system column? TRUE keeps both the source and target coding system columns in the output dataset; FALSE keeps only the target (i.e. mapped) coding system column. Default is FALSE.

Details

The data frames of fact and code list mapping must be properly structured. The data.frame of mapping must have the 2 following columns:

Some codes might not be mapped because no correspondence exists between the source code(s) and the target code(s). In the output dataset of map_codelist, these unmapped codes are set to "UNK". If keep_src_code is set to FALSE, the source coding system column is dropped and the target coding system column is named dimension_to_map. If keep_src_code is set to TRUE, the source coding system column is kept. In that case, the source coding system column keeps its original name (dimension_to_map), and the target coding system column is named "dimension_to_map"_mapping (e.g. gear_mapping).
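The behaviour described above can be illustrated on toy data with base R. This is only a sketch under stated assumptions, not the package's implementation: the helper sketch_map, the mapping column names src_code and trg_code, and the codes "T1" and "T2" are hypothetical. It also ignores any additional matching dimension such as source_authority, and it returns a plain data.frame rather than the list returned by map_codelist.

```r
# Toy fact dataset: the "gear" column holds the source codes.
df_input <- data.frame(
  gear  = c("PS", "LL", "XX"),
  value = c(10, 20, 5),
  stringsAsFactors = FALSE
)

# Toy mapping table. The column names src_code and trg_code are
# assumptions made for this sketch only.
df_mapping <- data.frame(
  src_code = c("PS", "LL"),
  trg_code = c("T1", "T2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented column handling.
sketch_map <- function(df_input, df_mapping, dimension_to_map,
                       keep_src_code = FALSE) {
  # Look up each source code; codes without a match give NA.
  idx <- match(df_input[[dimension_to_map]], df_mapping$src_code)
  mapped <- df_mapping$trg_code[idx]
  # Unmapped codes are set to "UNK".
  mapped[is.na(mapped)] <- "UNK"

  if (keep_src_code) {
    # Keep the source column and add "<dimension>_mapping".
    df_input[[paste0(dimension_to_map, "_mapping")]] <- mapped
  } else {
    # Overwrite the source column with the target codes.
    df_input[[dimension_to_map]] <- mapped
  }
  df_input
}

res_drop <- sketch_map(df_input, df_mapping, "gear")
res_keep <- sketch_map(df_input, df_mapping, "gear", keep_src_code = TRUE)
```

In res_drop, the gear column holds the target codes, with "UNK" for the unmatched code "XX". In res_keep, gear still holds the source codes and gear_mapping holds the target codes.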

Value

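a list with two objects: df, the input dataset with the dimension mapped, and stats, information about the data that were not mapped (both are accessed in the Examples).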

Author(s)

Paul Taconet, paul.taconet@ird.fr

See Also

Other process data: convert_units, create_calendar, create_grid, get_rfmos_datasets_level0, raise_datasets_by_dimension, raise_get_rf, raise_incomplete_dataset_to_total_dataset, rasterize_geo_timeseries, spatial_curation_downgrade_resolution, spatial_curation_intersect_areas, spatial_curation_reallocate_data, spatial_curation_upgrade_resolution

Examples

# Connect to the Tuna atlas database
con <- db_connection_tunaatlas_world()

# Read the IOTC nominal catch dataset (2017 release)
iotc_nominal_catch <- extract_dataset(con, list_metadata_datasets(con, identifier = "indian_ocean_nominal_catch_1950_01_01_2015_01_01_tunaatlasIOTC_2017_level0"))
head(iotc_nominal_catch)

# Read a mapping between code lists (here, between the fishing gear codes
# used by the tuna RFMOs and the International Standard Statistical
# Classification of Fishing Gear, ISSCFG)
df_mapping <- extract_dataset(con, list_metadata_datasets(con, identifier = "codelist_mapping_gear_iotc_isscfg_revision_1"))
head(df_mapping)

# Map code lists. The output is a list with two elements (see section "Value").
# The default keeps only the target coding system in the output dataset.
# Set keep_src_code=TRUE to keep both the source and target coding systems.
df_mapped <- map_codelist(iotc_nominal_catch, df_mapping, "gear", FALSE)

# Get the mapped data frame: dimension "gear" is now mapped to ISSCFG, so the
# values of the "gear" column differ from those before the call. The codes
# were mapped on both "gear" and "source_authority", since the dataset of
# mappings between code lists had both dimensions.
df_mapped_df <- df_mapped$df
head(df_mapped_df)

# Get information about the data that were not mapped
df_mapped$stats

ptaconet/rtunaatlas documentation built on Sept. 21, 2021, 10:43 p.m.