sdsdecoding | R Documentation |
SDS traditionally provides a set of predefined values for each variable. That is not just a convenience: in theory it also allows a high degree of comparability between different datasets. These predefined values/categories are encoded with a simple, minimalistic alphanumeric scheme. That scheme is a technological relic, both of the era in which the systems that inspired SDS were created and of a time when most stone tool analysis was done without a computer within reach.
The encoding has the big disadvantage that it is not immediately human-readable. To understand an SDS dataset you are forced to constantly look up variables in the SDS publications, which makes it very difficult to get a quick overview.
sdsanalysis offers functions to quickly decode the cryptic codes in the SDS tables and replace them with human readable descriptions. This is implemented with hash tables to enable high-speed transformation even for datasets with thousands of artefacts. The hash tables are compiled from two reference tables for variables and variable values.
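The hash table idea can be illustrated with a minimal sketch in base R. It is not the package's actual implementation: the reference table, its codes and labels, and the helper names `build_hash` and `decode` are all hypothetical and only show how an environment can serve as a hash table for decoding.

```r
# Hypothetical reference table: encoded value -> human-readable description.
# Codes and labels are invented for illustration; they are not real SDS values.
ref <- data.frame(
  code  = c("1", "2", "9"),
  label = c("none", "partial", "unknown"),
  stringsAsFactors = FALSE
)

# Build a hash table: an environment created with hash = TRUE.
build_hash <- function(keys, values) {
  h <- new.env(hash = TRUE, parent = emptyenv())
  for (i in seq_along(keys)) {
    assign(keys[i], values[i], envir = h)
  }
  h
}

# Decode a vector of codes; codes missing from the table become NA.
decode <- function(x, h) {
  vapply(
    as.character(x),
    function(k) {
      if (!is.na(k) && exists(k, envir = h, inherits = FALSE)) {
        get(k, envir = h, inherits = FALSE)
      } else {
        NA_character_
      }
    },
    character(1),
    USE.NAMES = FALSE
  )
}

h <- build_hash(ref$code, ref$label)
decoded <- decode(c("2", "1", "9", "7"), h)
```

Each lookup in the environment takes roughly constant time, so the cost of decoding grows with the number of values to decode rather than with the size of the reference table.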
lookup_everything
:
Wizard function. Enter an SDS data.frame and receive a decoded version. This function employs the ones below and some more
helpers to make the decoding process as simple as possible.
lookup_vars
:
In: character vector with variable IDs (e.g. FB1_23, FB2_56)
Out: character vector with short variable names (menge_rinde, dorsal_praep)
lookup_var_complete_names
:
In: character vector with short variable names (menge_rinde, dorsal_praep)
Out: character vector with long variable names
(e.g. Art der Dorsalflaechenpraeparation, Menge der Rinde und natuerlichen Sprungflaeche)
lookup_var_types
:
In: character vector with short variable names (menge_rinde, dorsal_praep)
Out: character vector with variable data types (e.g. character, numeric)
apply_var_types
:
In: encoded variable vector (SDS data.frame column) + respective variable short name
Out: encoded variable vector with corrected data type
Named apply instead of lookup because in this case the result of another lookup is
used to manipulate the input vector.
lookup_attrs
:
In: encoded variable vector (SDS data.frame column) + respective variable short name
Out: decoded variable vector
lookup_attr_types
:
In: encoded variable vector (SDS data.frame column) + respective variable short name
Out: character vector with semantic type (e.g. normal, unknown)
apply_attr_types
:
In: encoded variable vector (SDS data.frame column) + respective variable short name
Out: encoded variable vector with the appropriate values set to NA based on the semantic
type
lookup_IGerM_category
:
In: decoded IGerM vector
Out: IGerM category or subcategory vector
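The difference between the lookup and apply functions described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the lookup tables, the variable names `var_a` and `var_b`, their codes, and the helpers `apply_types_sketch` and `mask_sketch` are hypothetical. It also assumes that every value whose semantic type is not `normal` is set to NA, which the description above does not spell out.

```r
# Hypothetical lookup tables; real SDS variables, codes and types may differ.
var_types <- c(var_a = "numeric", var_b = "character")
attr_types <- list(
  var_b = c("1" = "normal", "2" = "normal", "9" = "unknown")
)

# Convert a column to the data type registered for its variable.
apply_types_sketch <- function(var_data, var_short_name) {
  type <- var_types[[var_short_name]]
  switch(type,
    numeric   = as.numeric(var_data),
    character = as.character(var_data),
    stop("unsupported type: ", type)
  )
}

# Set values to NA when their semantic type is known and not "normal"
# (assumption). Codes absent from the table are left unchanged.
mask_sketch <- function(var_data, var_short_name) {
  types <- attr_types[[var_short_name]]
  sem <- unname(types[as.character(var_data)])
  var_data[!is.na(sem) & sem != "normal"] <- NA
  var_data
}

num_col <- apply_types_sketch(c("12.5", "30"), "var_a")
cat_col <- mask_sketch(c("1", "9", "2"), "var_b")
```

In both helpers the result of a lookup (data type or semantic type) is used to change the input vector itself, which is what distinguishes the apply functions from the plain lookups.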
lookup_everything(sds_df)
lookup_vars(var_ids)
lookup_var_complete_names(var_short_names)
lookup_var_types(var_short_names)
apply_var_types(var_data, var_short_name)
lookup_attrs(var_data, var_short_name)
lookup_attr_types(var_data, var_short_name)
apply_attr_types(var_data, var_short_name)
lookup_IGerM_category(igerm_data, subcategory = FALSE)
sds_df | Dataframe. Data.frame in SDS standard format. |
var_ids | Character vector. Variable IDs. |
var_short_names | Character vector. Variable short names. |
var_data | Vector. Variable data. |
var_short_name | Character. Variable short name. |
igerm_data | Character vector. IGerM character codes in data. |
subcategory | Boolean. Should the function return IGerM subcategories instead of categories? |