glottoconvert: Convert a linguistic dataset into glottodata or glottosubdata

View source: R/glottoconvert.R

glottoconvert    R Documentation

Description

This function is mainly intended for 'messy' datasets that are not in glottodata/glottosubdata structure.

Usage

glottoconvert(
  data,
  var = NULL,
  glottocodes = NULL,
  table = NULL,
  glottocolumn = NULL,
  glottosubcolumn = NULL,
  ref = NULL,
  page = NULL,
  remark = NULL,
  contributor = NULL,
  varnamecol = NULL
)

Arguments

data

A dataset that should be converted into glottodata/glottosubdata. This will generally be an Excel file loaded with glottoget().

The dataset will be converted into glottodata if:

  • all data are stored in a single table, or

  • the dataset contains several tables of which one is called 'glottodata', or

  • a table argument is provided.

Otherwise, glottospace will attempt to convert the dataset into glottosubdata. This works if:

  • table names are glottocodes, and

  • a value is supplied for the glottocodes argument, or the dataset contains a sample table from which glottocodes can be obtained.
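
The rules above can be sketched in base R. This is only an illustration of the documented decision rules, not the code glottospace runs. The helper name guess_target() is hypothetical, and the sketch assumes that a glottocode is four alphanumeric characters followed by four digits and that the sample table is called 'sample'.

```r
# Hypothetical helper, not part of glottospace: it mirrors the documented rules only.
guess_target <- function(table_names, table = NULL, glottocodes = NULL) {
  # Assumed glottocode shape: four alphanumeric characters followed by four digits.
  is_glottocode <- function(x) grepl("^[a-z0-9]{4}[0-9]{4}$", x)

  # glottodata: a single table, a table called 'glottodata',
  # or an explicit table argument.
  if (length(table_names) == 1 ||
      "glottodata" %in% table_names ||
      !is.null(table)) {
    return("glottodata")
  }

  # glottosubdata: table names are glottocodes, and glottocodes are supplied
  # or can be looked up in a sample table (assumed to be called 'sample').
  code_tables <- table_names[is_glottocode(table_names)]
  has_codes <- !is.null(glottocodes) || "sample" %in% table_names
  if (length(code_tables) > 0 && has_codes) {
    return("glottosubdata")
  }

  # Neither set of rules applies.
  "undetermined"
}

guess_target("Sheet1")
guess_target(c("yucu1253", "tani1257", "sample"))
```

The first call has a single table and so falls under the glottodata rules. The second has tables named after glottocodes plus a sample table and so falls under the glottosubdata rules.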

var

Character string that distinguishes those columns which contain variable names.

glottocodes

Optional character vector of glottocodes. If no glottocodes are supplied, glottospace will search for them in the sample table.

table

If the dataset consists of multiple tables, the name of the table that contains the data to be converted.

glottocolumn

Column name or column id with glottocodes (optional, provide if glottocodes are not stored in a column called 'glottocode')

glottosubcolumn

Column name or column id with glottosubcodes (optional, provide if glottosubcodes are not stored in a column called 'glottosubcode')

ref

Character string that distinguishes those columns which contain references.

page

Character string that distinguishes those columns which contain page numbers.

remark

Character string that distinguishes those columns which contain remarks.

contributor

Character string that distinguishes those columns which contain contributors.

varnamecol

If the dataset contains a structure table in which the column holding the variable names is not called 'varname', the name of that column.
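
The var, ref, page, remark and contributor arguments each take a string that marks a group of columns. A minimal base R sketch of the idea follows, assuming the string is used as a column-name prefix (as in the "var_" example under Examples). The toy data frame, the prefixes other than "var_", and the helper cols_with_prefix() are made up for illustration and are not part of glottospace.

```r
# Toy data frame; the column names and prefixes are made up for illustration.
messy <- data.frame(
  glottocode = c("yucu1253", "tani1257"),
  var_color  = c("red", "blue"),
  ref_color  = c("Smith 2001", "Jones 1999"),
  page_color = c("12", "40"),
  notes      = c("x", "y")
)

# Hypothetical helper: return the column names that start with a given prefix.
cols_with_prefix <- function(data, prefix) {
  colnames(data)[startsWith(colnames(data), prefix)]
}

var_cols  <- cols_with_prefix(messy, "var_")
ref_cols  <- cols_with_prefix(messy, "ref_")
page_cols <- cols_with_prefix(messy, "page_")

# Stripping the prefix recovers the variable name shared by the related columns.
var_names <- sub("^var_", "", var_cols)
```

Columns that match none of the strings, such as notes here, are the ones a conversion would have no reason to treat as variables, references or page numbers.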

Value

A glottodata or glottosubdata object (either a list or a data.frame).

Examples

# Create a messy dataset:
glottodata <- glottoget("demodata")
glottodata <- cbind(glottodata, data.frame(redundant = 1:6))

# Nothing in this messy dataset indicates which columns contain the relevant variables.
# We therefore add a prefix to the names of the relevant columns (here columns 2 and 3):
colnames(glottodata)[2:3] <- paste0("var_", colnames(glottodata)[2:3])

glottoconverted <- glottoconvert(glottodata, var = "var_")

SietzeN/glottospace documentation built on June 15, 2024, 10:45 p.m.