sdsdecoding: sdsanalysis decoding functions
In nevrome/sdsanalysis: Interface To Work With SDS Stone Tool Analysis Data

sdsdecoding

R Documentation

sdsanalysis decoding functions

Description

SDS traditionally provides a set of predefined values for each variable. This is not just a convenience: in theory it also allows a high degree of comparability between different datasets. These predefined values/categories are encoded with a simple, minimalistic alphanumeric scheme. The scheme is a technological relic, both of the time when the systems that inspired SDS were created and of the time when most stone tool analysis was done without a computer within reach.

The encoding has the major disadvantage that it is not immediately human-readable. To understand an SDS dataset, you are forced to constantly look up each new variable in the SDS publications, which makes it very difficult to get a quick overview.

sdsanalysis offers functions to quickly decode the cryptic codes in the SDS tables and replace them with human-readable descriptions. Decoding is implemented with hash tables so that the transformation stays fast even for datasets with thousands of artefacts. The hash tables are compiled from two reference tables, one for variables and one for variable values.
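
The hash-table idea can be sketched in base R with an environment. This is only an illustration of the technique, not the package's actual implementation: the reference table, the variable name example_var, its codes and descriptions, and the helper decode_codes are all hypothetical.

```r
# Hypothetical reference table: one row per (variable, code) pair.
# Variable name, codes and descriptions are invented for illustration.
attr_reference <- data.frame(
  variable = c("example_var", "example_var", "example_var"),
  code = c("1", "2", "9"),
  description = c("absent", "present", "unknown"),
  stringsAsFactors = FALSE
)

# Build the hash table: an environment keyed by "variable|code".
attr_hash <- new.env(hash = TRUE)
for (i in seq_len(nrow(attr_reference))) {
  key <- paste(attr_reference$variable[i], attr_reference$code[i], sep = "|")
  assign(key, attr_reference$description[i], envir = attr_hash)
}

# Decode a vector of codes for one variable.
# Codes without an entry in the hash table are returned as NA.
decode_codes <- function(codes, variable, hash) {
  keys <- paste(variable, codes, sep = "|")
  found <- mget(keys, envir = hash, ifnotfound = list(NA_character_))
  unname(unlist(found))
}

decoded <- decode_codes(c("2", "1", "9", "7"), "example_var", attr_hash)
```

Each lookup in an environment takes roughly constant time, so the cost of decoding grows with the number of values to decode rather than with the size of the reference table.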

lookup_everything: Wizard function. Enter an SDS data.frame and receive a decoded version. This function employs the ones below and some more helpers to make the decoding process as simple as possible.
lookup_vars: In: character vector with variable IDs (e.g. FB1_23, FB2_56) Out: character vector with short variable names (e.g. menge_rinde, dorsal_praep)
lookup_var_complete_names: In: character vector with short variable names (e.g. menge_rinde, dorsal_praep) Out: character vector with long variable names (e.g. Menge der Rinde und natuerlichen Sprungflaeche, Art der Dorsalflaechenpraeparation)
lookup_var_types: In: character vector with short variable names (e.g. menge_rinde, dorsal_praep) Out: character vector with variable data types (e.g. character, numeric)
apply_var_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: encoded variable vector with corrected data type. Named apply rather than lookup because the result of another lookup is used to modify the input vector.
lookup_attrs: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: decoded variable vector
lookup_attr_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: character vector with semantic types (e.g. normal, unknown)
apply_attr_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: encoded variable vector with the appropriate values set to NA based on their semantic type
lookup_IGerM_category: In: decoded IGerM vector Out: IGerM category or subcategory vector
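
The step of setting values to NA based on their semantic type can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the codes, the type mapping attr_types and the helper mask_by_type are hypothetical, and it is assumed here that only the semantic type "unknown" is turned into NA.

```r
# Hypothetical semantic types for the codes of one variable (invented values).
attr_types <- c("1" = "normal", "2" = "normal", "9" = "unknown")

# Replace every code whose semantic type is "unknown" with NA.
# Assumption: codes with other types, or without a known type, stay untouched.
mask_by_type <- function(codes, types) {
  semantic <- unname(types[as.character(codes)])
  codes[!is.na(semantic) & semantic == "unknown"] <- NA
  codes
}

masked <- mask_by_type(c("1", "9", "2"), attr_types)
```

The input vector keeps its length and data type; only the flagged positions change, so the result can be written straight back into the data.frame column it came from.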

Usage

lookup_everything(sds_df)

lookup_vars(var_ids)

lookup_var_complete_names(var_short_names)

lookup_var_types(var_short_names)

apply_var_types(var_data, var_short_name)

lookup_attrs(var_data, var_short_name)

lookup_attr_types(var_data, var_short_name)

apply_attr_types(var_data, var_short_name)

lookup_IGerM_category(igerm_data, subcategory = FALSE)

Arguments

`sds_df`	Data.frame. Data in SDS standard format.
`var_ids`	Character vector. Variable IDs.
`var_short_names`	Character vector. Variable short names.
`var_data`	Vector. Variable data.
`var_short_name`	Character. Variable short name.
`igerm_data`	Character vector. IGerM character codes in data.
`subcategory`	Logical. Should the function return IGerM subcategories instead of categories?