data_measure_info: Makes a measurement metadata file

View source: R/data_measure_info.R

data_measure_info    R Documentation

Description

Make a measure_info.json file, or add measure entries to an existing one.

Usage

data_measure_info(path, ..., info = list(), references = list(),
  strict = FALSE, include_empty = TRUE, overwrite_entry = FALSE,
  render = NULL, overwrite = FALSE, write = TRUE, verbose = TRUE,
  open_after = interactive())

Arguments

path

Path to the measure_info.json file, existing or to be created.

...

Lists containing individual measure items. See the Measure Entries section.

info

A list containing measurement information to be added.

references

A list containing citation entries. See the Reference Entries section.

strict

Logical; if TRUE, will only allow recognized entries and values.

include_empty

Logical; if FALSE, will omit entries that have not been provided.

overwrite_entry

Logical; if TRUE, will replace rather than add to an existing entry.

render

Path to which a version of path, with dynamic entries expanded, is saved. See the Dynamic Entries section.

overwrite

Logical; if TRUE, will overwrite rather than add to an existing file at path.

write

Logical; if FALSE, will not write the built or rendered measure info.

verbose

Logical; if FALSE, will not display status messages.

open_after

Logical; if FALSE, will not open the measure file after writing/updating.

Value

An invisible list containing measurement metadata (the rendered version if made).

Measure Entries

Each measure entry is named by its full variable name, and can contain any of these entries (only these are allowed if strict is TRUE):

  • measure: Name of the measure.

  • full_name: Full name of the measure, which is also the name of the entry.

  • short_name: Shortest possible display name.

  • long_name: Longer display name.

  • category: Arbitrary category for the measure.

  • short_description: Shortest possible description.

  • long_description: Complete description. Either description can include TeX-style equations, enclosed in escaped square brackets (e.g., "The equation \\[a_{i} = b^\\frac{c}{d}\\] was used."; or $...$, \\(...\\), or \\begin{math}...\\end{math}). The final enclosing symbol must be followed by a space or the end of the string. These are pre-rendered to MathML with katex_mathml.

  • statement: String with dynamic references to entity features (e.g., "measure value = {value}"). References can include:

    • value: Value of a currently displaying variable at a current time.

    • region_name: Alias of features.name.

    • features.<entry>: An entity feature, coming from entity_info.json or GeoJSON properties. All entities have at least name and id entries (e.g., "{features.id}").

    • variables.<entry>: A variable feature such as name which is the same as full_name (e.g., "{variables.name}").

    • data.<variable>: The value of another variable at a current time (e.g., "{data.variable_a}").

  • measure_type: Type of the measure's value. Recognized types are displayed in a special way:

    • year or integer show as entered (usually as whole numbers). Other numeric types are rounded to show a set number of digits.

    • percent shows as {value}%.

    • minutes shows as {value} minutes.

    • dollar shows as ${value}.

    • internet speed shows as {value} Mbps.

  • unit: Prefix or suffix associated with the measure's type, such as % for percent, or Mbps for rate.

  • sources: A list or list of lists containing source information, including any of these entries:

    • name: Name of the source (such as an organization name).

    • url: General URL of the source (such as an organization's website).

    • location: More specific description of the source (such as the name of a particular data product).

    • location_url: More direct URL to the resource (such as a page listing data products).

    • date_accessed: Date of retrieval (arbitrary format).

  • citations: A vector of reference ids (the names of reference entries; e.g., c("ref1", "ref3")).

  • layer: A list specifying an output_map overlay:

    • source (required): A URL to a GeoJSON file, or a list with a url and time entry, where time conditions the display of the layer on the currently selected time. As an alternative to a list that specifies time, the URL can include a dynamic reference to time, if the time values correspond to a component of the URL (e.g., "https://example.com/{time}/points.geojson").

    • filter: A list or list of lists specifying how the elements of the layer should be filtered for this variable:

      • feature: Name of the layer's property to filter on.

      • operator: Operator to filter by (e.g., "=" or "!=").

      • value: Value to filter by.

  • categories: A named list of categories, with any of the other measure entries, or a default entry giving a default category name. See the Dynamic Entries section.

  • variants: A named list of variants, with any of the other measure entries, or a default entry giving a default variant name. See the Dynamic Entries section.
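
As a sketch of how several of these entries might fit together, the following builds one measure entry with sources, citations, and a layer. The measure name, source details, filter property, and reference id are hypothetical placeholders, and the call assumes the function is available from the community package (the package named in this page's footer).

```r
# Hypothetical measure entry; all names, URLs, and values are placeholders.
entry <- list(
  measure = "download_speed",
  full_name = "download_speed",
  short_name = "Download speed",
  short_description = "Average download speed.",
  statement = "{region_name} has an average download speed of {value}.",
  measure_type = "internet speed",
  unit = "Mbps",
  # a list of lists, so more sources can be appended later
  sources = list(
    list(
      name = "Example Organization",
      url = "https://example.com",
      location = "Example data product",
      location_url = "https://example.com/products",
      date_accessed = "2023-10-12"
    )
  ),
  # should match the name of a reference entry
  citations = c("ref1"),
  layer = list(
    source = "https://example.com/{time}/points.geojson",
    filter = list(feature = "type", operator = "=", value = "tower")
  )
)

# Only runs if the package is installed; writes to a temporary file.
if (requireNamespace("community", quietly = TRUE)) {
  info <- community::data_measure_info(
    tempfile(),
    download_speed = entry,
    verbose = FALSE, open_after = FALSE
  )
  print(names(info))
}
```

Here the entry name, measure, and full_name are kept identical for simplicity; a prefixed full_name, as in the Examples section, should work the same way.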

Dynamic Entries

You may have several closely related variables in a dataset, which share sections of metadata, or have formulaic differences. In cases like this, the categories and/or variants entries can be used along with dynamic notation to construct multiple entries from a single template.

Though functionally the same, categories might include broken-out subsets of some total (such as race groups, as categories of a total population), whereas variants may be different transformations of the same variable (such as raw counts versus percentages).

In dynamic entries, {category} or {variant} refers to entries in the categories or variants lists. By default, these are replaced with the name of each entry in those lists (e.g., "variable_{category}" where categories = "a" would become "variable_a"). A default entry would change this behavior (e.g., with categories = list(a = list(default = "b")) that would become "variable_b"). Adding .name would force the original behavior (e.g., "variable_{category.name}" would be "variable_a"). A name of "blank" is treated as an empty string.

When notation appears in a measure info entry, it will first default to a matching name in the categories or variants list; for example, short_name in list(short_name = "variable {category}") with categories = list(a = list(short_name = "(category a)")) would become "variable (category a)". To force this behavior, the entry name can be included in the notation (e.g., "variable {category.short_name}" would be "variable (category a)" in any entry).
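
The replacement rules above can be illustrated with a small base R sketch. The fill_category helper is hypothetical: it is a simplified stand-in written for this illustration, not the package's implementation, and it only handles {category} (the same logic would apply to {variant}).

```r
# Hypothetical helper mirroring the documented replacement order:
# a matching entry first, then a default entry, then the category name.
fill_category <- function(template, name, category = list(), entry = NULL) {
  # a name of "blank" is treated as an empty string
  if (name == "blank") name <- ""
  value <- name
  if (!is.null(category$default)) value <- category$default
  if (!is.null(entry) && !is.null(category[[entry]])) value <- category[[entry]]
  # {category.name} always gives the name
  out <- gsub("{category.name}", name, template, fixed = TRUE)
  # {category.<entry>} forces a lookup of that entry
  for (key in names(category)) {
    out <- gsub(paste0("{category.", key, "}"), category[[key]], out, fixed = TRUE)
  }
  gsub("{category}", value, out, fixed = TRUE)
}

fill_category("variable_{category}", "a")
fill_category("variable_{category}", "a", list(default = "b"))
fill_category("variable_{category.name}", "a", list(default = "b"))
fill_category(
  "variable {category}", "a",
  list(short_name = "(category a)"),
  entry = "short_name"
)
```

Under these assumptions, the four calls give "variable_a", "variable_b", "variable_a", and "variable (category a)", matching the examples in the text.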

Only string entries are processed dynamically; any list-like entries (such as sources, citations, or layer) appearing in categories or variants entries will fully replace the base entry.

Dynamic entries can be kept dynamic when passed to a data site, but can be rendered for other uses, where the rendered version will have each dynamic entry replaced with all unique combinations of categories and variants entries, assuming both are used in the dynamic entry's name (e.g., "variable_{category}_{variant}"). See Examples.
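
To make the idea of "all unique combinations" concrete, here is a minimal base R sketch that expands a dynamic name over hypothetical categories and variants. It only illustrates the combinatorics; the order and exact names produced by the package's own rendering may differ.

```r
# Hypothetical category and variant names.
categories <- c("a", "b")
variants <- c("u1", "u2")
template <- "variable_{category}_{variant}"

# One row per unique category/variant combination.
combos <- expand.grid(
  category = categories, variant = variants,
  stringsAsFactors = FALSE
)

# Fill the template once per combination.
rendered_names <- mapply(function(category, variant) {
  out <- gsub("{category}", category, template, fixed = TRUE)
  gsub("{variant}", variant, out, fixed = TRUE)
}, combos$category, combos$variant, USE.NAMES = FALSE)

rendered_names
```

Two categories and two variants give four rendered names, one for each pairing.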

Reference Entries

Reference entries can be included in a _references entry, and should have names corresponding to those included in any of the measures' citations entries. These can include any of these entries:

  • id: The reference id, same as the entry name.

  • author: A list or list of lists specifying one or more authors. These can include entries for given and family names.

  • year: Year of the publication.

  • title: Title of the publication.

  • journal: Journal in which the publication appears.

  • volume: Volume number of the journal.

  • page: Page number of the journal.

  • doi: Digital Object Identifier, from which a link is made (https://doi.org/{doi}).

  • version: Version number of software.

  • url: Link to the publication, alternative to a DOI.
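
A reference list might be put together as in the sketch below. Every value is a placeholder rather than a real publication, the given and family author keys are an assumption based on the description above, and the call assumes the function is available from the community package.

```r
# Placeholder reference; none of these values describe a real publication.
references <- list(
  ref1 = list(
    id = "ref1",
    # a list of lists allows for more than one author
    author = list(list(given = "Given", family = "Family")),
    year = 2020,
    title = "Example Title",
    journal = "Example Journal",
    volume = "1",
    page = "1-10",
    doi = "10.0000/example"
  )
)

# Only runs if the package is installed; writes to a temporary file.
if (requireNamespace("community", quietly = TRUE)) {
  info <- community::data_measure_info(
    tempfile(),
    "measure name" = list(
      measure = "measure name",
      full_name = "prefix:measure name",
      short_description = "A measure.",
      # refers to the reference entry by name
      citations = "ref1"
    ),
    references = references,
    verbose = FALSE, open_after = FALSE
  )
  print(names(info))
}
```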

Examples

path <- tempfile()

# make an initial file
data_measure_info(path, "measure name" = list(
  measure = "measure name",
  full_name = "prefix:measure name",
  short_description = "A measure.",
  statement = "This entity has {value} measure units."
), verbose = FALSE)

# add another measure to that
measure_info <- data_measure_info(path, "measure two" = list(
  measure = "measure two",
  full_name = "prefix:measure two",
  short_description = "Another measure.",
  statement = "This entity has {value} measure units."
), verbose = FALSE)
names(measure_info)

# add a dynamic measure, and make a rendered version
measure_info_rendered <- data_measure_info(
  path,
  "measure {category} {variant.name}" = list(
    measure = "measure {category}",
    full_name = "{variant}:measure {category}",
    short_description = "Another measure ({category}; {variant}).",
    statement = "This entity has {value} {category} {variant}s.",
    categories = c("a", "b"),
    variants = list(u1 = list(default = "U1"), u2 = list(default = "U2"))
  ),
  render = TRUE, verbose = FALSE
)
names(measure_info_rendered)
measure_info_rendered[["measure a u1"]]$statement

uva-bi-sdad/community documentation built on Oct. 12, 2023, 1:18 p.m.