mudata: Create a mudata object

View source: R/mudata.R

mudata R Documentation

Create a mudata object

Description

Create a mudata object, which is a collection of five tables: data, locations, params, datasets, and columns. You are only required to provide the data table, which must contain columns "param" and "value", but will more typically contain columns "location", "param", "datetime" (or "date"), and "value". See ns_climate, kentvillegreenwood, alta_lake, long_lake, and second_lake_temp for examples of data in this format.

Usage

mudata(
  data,
  locations = NULL,
  params = NULL,
  datasets = NULL,
  columns = NULL,
  x_columns = NULL,
  ...,
  more_tbls = NULL,
  dataset_id = "default",
  location_id = "default",
  validate = TRUE
)

Arguments

data

A data.frame/tibble containing columns "param" and "value" (at least), but more typically columns "location", "param", "datetime" (or "date", depending on the type of data), and "value".

locations

The locations table, which is a data frame containing (at least) the columns "dataset" and "location". If omitted, it will be created automatically using all unique dataset/location combinations.

params

The params table, which is a data frame containing (at least) the columns "dataset" and "param". If omitted, it will be created automatically using all unique dataset/param combinations.

datasets

The datasets table, which is a data frame containing the column (at least) "dataset". If omitted, it will be generated automatically using all unique datasets.

columns

The columns table, which is a data frame containing the columns (at least) "dataset", "table", and "column". If omitted, it will be created automatically using all dataset/table/column combinations.

x_columns

A vector of column names from the data table that, in combination with "dataset", "location", and "param", identify unique rows. If omitted, these are typically guessed as the columns located between "param" and "value".

..., more_tbls

Additional tbls (passed as named arguments) to be included in the mudata object.

dataset_id

The dataset to use if a "dataset" column is omitted.

location_id

The location to use if a "location" column is omitted.

validate

Pass FALSE to skip validation of input tables using validate_mudata.
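
The defaults described in the arguments above can be pictured with a small base R sketch. It only illustrates the documented behaviour and is not the package's internal code; the toy data frame and the helper guess_x_columns() are hypothetical names introduced here.

```r
# Hypothetical long-format data table with no "dataset" or "location" column
toy <- data.frame(
  param = c("temp", "temp", "precip", "precip"),
  date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02")),
  value = c(-3.1, -1.4, 0.0, 2.5)
)

# dataset_id / location_id: fill in identifier columns that are missing
if (!"dataset" %in% names(toy)) toy$dataset <- "default"
if (!"location" %in% names(toy)) toy$location <- "default"

# locations / params / datasets: one row per unique combination
locations <- unique(toy[c("dataset", "location")])
params <- unique(toy[c("dataset", "param")])
datasets <- unique(toy["dataset"])

# x_columns: column names strictly between "param" and "value"
# (illustrative helper, not a function from the package)
guess_x_columns <- function(tbl) {
  nms <- names(tbl)
  i <- match("param", nms)
  j <- match("value", nms)
  if (j - i > 1) nms[(i + 1):(j - 1)] else character(0)
}
x_columns <- guess_x_columns(toy)

# together with dataset, location, and param, the x_columns should
# identify each row of the data table exactly once
id_cols <- c("dataset", "location", "param", x_columns)
rows_are_unique <- anyDuplicated(toy[id_cols]) == 0
```

Here x_columns comes out as "date", and rows_are_unique is the kind of condition the x_columns argument is meant to satisfy.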

Value

An object of class "mudata", which is a list with components data, locations, params, datasets, columns, and any other tables provided in more_tbls. All list components must be tbls.
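
Because the return value is a list of tbls, its components can be inspected with ordinary list tools. The following is a minimal sketch, assuming the package that provides mudata() is installed under the name mudata2 (an assumption; the package name is not stated on this page) and that the kentvillegreenwood dataset mentioned above ships with it.

```r
# Run only if the assumed package name is available
if (requireNamespace("mudata2", quietly = TRUE)) {
  # build a mudata object from the data table alone, as in the Examples
  kg <- mudata2::mudata(mudata2::tbl_data(mudata2::kentvillegreenwood))

  # the object is a list, so standard list accessors apply
  component_names <- names(kg)
  is_mudata <- inherits(kg, "mudata")
  data_tbl <- kg$data
}
```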

References

Dunnington DW and Spooner IS (2018). "Using a linked table-based structure to encode self-describing multiparameter spatiotemporal data". FACETS. doi:10.1139/facets-2017-0026

Examples

# attach the package providing mudata() (name assumed to be mudata2)
library(mudata2)

# use the data table from kentvillegreenwood as a template
kg_data <- tbl_data(kentvillegreenwood)
# create mudata object using just the data table
mudata(kg_data)

# create a mudata object starting from a parameter-wide data frame
library(tidyr)
library(dplyr)

# gather columns and summarise replicates
datatable <- pocmaj %>%
  gather(Ca, Ti, V, key = "param", value = "param_value") %>%
  group_by(core, param, depth) %>%
  summarise(value = mean(param_value), sd = sd(param_value)) %>%
  rename(location = core)

# create mudata object
mudata(datatable)


paleolimbot/mudata documentation built on Oct. 3, 2023, 10:03 a.m.