sdsdecoding: sdsanalysis decoding functions

sdsdecoding R Documentation

sdsanalysis decoding functions

Description

SDS traditionally provides a set of predefined values for each variable. That's not just a convenience: in theory it also allows for a high degree of comparability between different datasets. These predefined values/categories are encoded with a simple, minimalistic alphanumeric scheme. The scheme is a technological relic, both of the time when the systems that inspired SDS were created and of a time when most stone tool analysis was done without a computer within reach.

The encoding has the major disadvantage that it's not immediately human-readable. If you try to understand an SDS dataset, you're forced to constantly look up new variables in the SDS publications. That makes it very difficult to get a quick overview.

sdsanalysis offers functions to quickly decode the cryptic codes in the SDS tables and replace them with human readable descriptions. This is implemented with hash tables to enable high-speed transformation even for datasets with thousands of artefacts. The hash tables are compiled from two reference tables for variables and variable values.
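The hash table idea can be illustrated with a minimal base R sketch. This is not the package's actual implementation: the reference table `ref`, its codes and labels, and the helper `decode` are all hypothetical and only show how an environment can serve as a hash table for code lookups.

```r
# Hypothetical reference table: code -> description for one variable.
# Codes and labels are invented for illustration, not taken from SDS.
ref <- data.frame(
  code = c("0", "1", "2", "9"),
  label = c("none", "less than one third", "more than one third", "unknown"),
  stringsAsFactors = FALSE
)

# Build the hash table: R environments are hashed key-value stores.
hash <- new.env(hash = TRUE, parent = emptyenv())
for (i in seq_len(nrow(ref))) {
  assign(ref$code[i], ref$label[i], envir = hash)
}

# Decode a vector of codes. Codes missing from the table (or NA) yield NA.
decode <- function(codes, hash) {
  vapply(
    as.character(codes),
    function(x) {
      if (is.na(x) || !exists(x, envir = hash, inherits = FALSE)) {
        NA_character_
      } else {
        get(x, envir = hash, inherits = FALSE)
      }
    },
    character(1),
    USE.NAMES = FALSE
  )
}

decoded <- decode(c("1", "9", "0", "7"), hash)
print(decoded)
```

Each lookup in the environment takes roughly constant time regardless of the size of the reference table, which is what makes this approach attractive for datasets with thousands of artefacts.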

  • lookup_everything: Wizard function. Enter an SDS data.frame and receive a decoded version. This function employs the ones below and some more helpers to make the decoding process as simple as possible.

  • lookup_vars: In: character vector with variable IDs (e.g. FB1_23, FB2_56) Out: character vector with short variable names (menge_rinde, dorsal_praep)

  • lookup_var_complete_names: In: character vector with short variable names (e.g. menge_rinde, dorsal_praep) Out: character vector with long variable names (e.g. Menge der Rinde und natuerlichen Sprungflaeche, Art der Dorsalflaechenpraeparation)

  • lookup_var_types: In: character vector with short variable names (menge_rinde, dorsal_praep) Out: character vector with variable data types (e.g. character, numeric)

  • apply_var_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: encoded variable vector with corrected data type. Named apply instead of lookup because in this case the result of another lookup is used to manipulate the input vector.

  • lookup_attrs: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: decoded variable vector

  • lookup_attr_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: character vector with semantic type (e.g. normal, unknown)

  • apply_attr_types: In: encoded variable vector (SDS data.frame column) + respective variable short name Out: encoded variable vector in which the values that the semantic type marks as missing are set to NA

  • lookup_IGerM_category: In: decoded IGerM vector Out: IGerM category or subcategory vector
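How the per-column functions above relate to each other can be sketched in base R without the package. Everything in this example is an assumption made for illustration: the table `attr_ref`, its codes and descriptions, and the `sketch_*` helpers are hypothetical stand-ins, not the package's reference tables or functions. It also assumes that values with the semantic type unknown are the ones set to NA, and it uses `match()` rather than hash tables for brevity.

```r
# Hypothetical attribute reference table for a single variable.
# Codes, descriptions and types are invented, not taken from SDS.
attr_ref <- data.frame(
  variable = "menge_rinde",
  code = c("0", "1", "9"),
  description = c("no cortex", "some cortex", "not determinable"),
  type = c("normal", "normal", "unknown"),
  stringsAsFactors = FALSE
)

# Look up one column of the reference table for every code in var_data.
sketch_lookup <- function(var_data, var_short_name, column) {
  ref <- attr_ref[attr_ref$variable == var_short_name, ]
  ref[[column]][match(as.character(var_data), ref$code)]
}

# Use the looked-up semantic types to blank out values in the input.
# Assumption: only the type "unknown" is turned into NA.
sketch_apply_attr_types <- function(var_data, var_short_name) {
  types <- sketch_lookup(var_data, var_short_name, "type")
  var_data[!is.na(types) & types == "unknown"] <- NA
  var_data
}

codes <- c("1", "9", "0")
descriptions <- sketch_lookup(codes, "menge_rinde", "description")
types <- sketch_lookup(codes, "menge_rinde", "type")
cleaned <- sketch_apply_attr_types(codes, "menge_rinde")

print(descriptions)
print(types)
print(cleaned)
```

The sketch mirrors the naming split described in the list: the lookup helpers only translate codes, while the apply helper uses the result of a lookup to modify the input vector itself.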

Usage

lookup_everything(sds_df)

lookup_vars(var_ids)

lookup_var_complete_names(var_short_names)

lookup_var_types(var_short_names)

apply_var_types(var_data, var_short_name)

lookup_attrs(var_data, var_short_name)

lookup_attr_types(var_data, var_short_name)

apply_attr_types(var_data, var_short_name)

lookup_IGerM_category(igerm_data, subcategory = FALSE)

Arguments

sds_df

Dataframe. Data.frame in SDS standard format.

var_ids

Character Vector. Variable IDs.

var_short_names

Character Vector. Variable short names.

var_data

Vector. Variable data.

var_short_name

Character. Variable short name.

igerm_data

Character vector. IGerM character codes in data.

subcategory

Boolean. Should the function return IGerM subcategories instead of categories? Default: FALSE.


nevrome/sdsanalysis documentation built on March 19, 2024, 11:48 p.m.