Syntax

yspec uses standard yaml syntax to state the data set column definitions.

NOTES

Instructions for including TeX in the yaml specification code are provided in a section below.

Organization

Save your data specification code in a file, typically with a .yaml file extension.

At the top of the file, include a block called SETUP__:; this is where the data set metadata is stored. For example:

SETUP__:
  description: PKPD analysis data set
  use_internal_db: true
  projectnumber: FOO123
  sponsor: MetrumRG

See the details below for other files that can be included here.

Next, list each data set column in order, with the data column name starting in the first column and ending with a colon. For example:

WT:
  short: weight
  unit: kg
  range: [50, 150]

This specifies a "short" name for this column as well as a unit and a range. A complete listing is provided below.

You can see a fully worked example by running

ys_help$yaml()

See the ?ys_help help topic for more information.

Or, you can export a collection of package assets with this command

ys_help$export(output="assets")

See the ?ys_help topic for more information.

SETUP__ specification fields

Data column specification fields

Namespaces

Namespaces are alternative representations of certain column data fields.

You can create namespaces by attaching a .<name> suffix to eligible fields.

For example, we can create a "tex" representation for unit like this

DV: 
  short: dependent variable
  unit: "microgram/mL"
  unit.tex: "$\\mu$g/mL"

Here, the unit: entry states the value for unit in the base namespace, the default data you get on load. Using unit.tex: introduces an entry for the tex namespace. After loading the spec, you can change to this namespace using

spec <- ys_load(...)
spec_tex <- ys_namespace(spec, "tex") 

Any time you attach a .<name> suffix to a field, yspec will interpret that as an attempt to enter namespace data. The user is responsible for creating and organizing namespaces and naming them. yspec will create the base namespace. Also, when rendering a data specification document, yspec will attempt to switch to the tex namespace if it exists. Beyond that, yspec is agnostic to the names of the namespaces you create.

As another example, we can have alternate short names depending on whether or not we are using that name to create axis titles for a plot

EGFR:
  short: estimated glomerular filtration rate
  short.plot: eGFR

or alternate decode values

SEX:
  values: [0, 1]
  decode: [male, female]
  decode.letter: [m, f]
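
To make the suffix convention concrete, here is a minimal base R sketch of what switching namespaces amounts to for a single column. It is only an illustration under stated assumptions: switch_namespace() is a hypothetical helper, a column is modeled as a plain list, and none of this is yspec's actual implementation. With a real spec, use ys_namespace() as shown above.

```r
# Hypothetical helper: copy every `<field>.<ns>` entry over its base `<field>`.
switch_namespace <- function(col, ns) {
  suffix <- paste0(".", ns)
  for (field in names(col)) {
    if (endsWith(field, suffix)) {
      # Strip the suffix to find the base field name, e.g. decode.letter -> decode
      base_field <- substr(field, 1, nchar(field) - nchar(suffix))
      col[[base_field]] <- col[[field]]
    }
  }
  col
}

# The SEX column from the example above, modeled as a plain list
sex <- list(
  values = c(0, 1),
  decode = c("male", "female"),
  decode.letter = c("m", "f")
)

sex_letter <- switch_namespace(sex, "letter")
sex_letter$decode
```

Asking for a namespace that the column does not define (for example "tex" here) leaves the base values untouched in this sketch.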

Defaults

Examples

Continuous values

WT:
  about: [weight, kg]
  range: [5, 300]

This is equivalent to

WT:
  short: weight
  unit: kg
  range: [5,300]

Character data

RACE:
  values: [White, Black, Native American, Other]

Any other array input structure can be used. For example

RACE: 
  values:
    - White
    - Black
    - Native American
    - Other

By default, values are printed as a comma-separated list. To print them in long format, set longvalues to true

RACE:
  values: [White, Black, Native American, Other]
  longvalues: true

Discrete data with decode

Method 1

SEX:
  values: {dude: 0, gal: 1}

Special handlers are available that add some flexibility to this value / decode specification.

The !value:decode handler allows you to put the value on the left and decode on the right

SEX: 
  values: !value:decode
    0 : dude
    1 : gal

The default behavior can be achieved with

SEX: 
  values: !decode:value
    dude: 0
    gal: 1

The handlers also allow you to associate multiple values with a single decode

STUDY:
  values: !decode:value
    phase 1 : [101, 102, 103]
    phase 2 : [201, 202, 203]
    phase 3 : [301, 302, 303]
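
One way to read the STUDY mapping is as a lookup table with one row per value, where every value listed under a label decodes to that label. The base R sketch below illustrates that reading. The helper expand_decode() is hypothetical and the list is typed by hand rather than read from a spec, so treat it as an assumption about the intent, not as yspec internals.

```r
# The STUDY mapping from above, written as a named list (decode -> values)
study <- list(
  "phase 1" = c(101, 102, 103),
  "phase 2" = c(201, 202, 203),
  "phase 3" = c(301, 302, 303)
)

# Hypothetical helper: flatten the mapping to one row per value
expand_decode <- function(x) {
  data.frame(
    value = unlist(x, use.names = FALSE),
    decode = rep(names(x), times = lengths(x)),
    stringsAsFactors = FALSE
  )
}

lookup <- expand_decode(study)

# Decode a single study number
lookup$decode[lookup$value == 202]
```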

Method 2

BQL:
  values: [0,1]
  decode: [not below quantitation limit, below quantitation limit]

Method 3

Really, this is the same as method 2, but easier to type and read when the decode gets long

BQL:
  values: [0, 1]
  decode:
    - not below the quantitation limit of 2 ng/ml
    - below the quantitation limit of 2 ng/ml
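
As a rough illustration of why values and decode are given as parallel lists, the base R sketch below pairs them positionally to label a data column. The vectors are typed by hand here and are assumptions for the example; nothing is read from a spec object.

```r
# values and decode from the BQL example, paired by position
bql_values <- c(0, 1)
bql_decode <- c("not below quantitation limit", "below quantitation limit")

# Some made-up data for the BQL column
bql <- c(0, 0, 1, 0)

# Each value is labeled with the decode in the same position
bql_f <- factor(bql, levels = bql_values, labels = bql_decode)
table(bql_f)
```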

Look up column definition

Either fill in the lookup field or use the !look handler

CMT:
  lookup: true

or

CMT: !look

You can also give the column name to import

HT: 
  lookup: HT_INCHES

In this example, there would be a column called HT_INCHES in the lookup file that would be imported under the name HT.

Include TeX in data specification document

Most define documents get rendered via xtable and the text gets processed by a sanitize function. yspec implements a custom sanitize function called ys_sanitize(), which is similar to xtable::sanitize, but whitelists some symbols so they do not get sanitized.

To protect TeX code from the sanitizer, first create a field in SETUP__ called glue with a map between a name and some corresponding TeX code. In the following example, we wish to write $\mu$g/L, so we create a name called mugL and map it to $\\mu$g/L:

SETUP__:
  glue: {mugL: "$\\mu$g/L"}

Once the map is in place, we can write the data set column definition like this:

DV: 
  unit: "<<mugL>>"

When the table for the define document is rendered, the sanitizer runs first, but it won't find anything to escape in the unit field for the DV column. Then yspec will call glue() and replace <<mugL>> with $\\mu$g/L.

Notice that we put all of the values in quotes; this is good practice to ensure that yaml will parse the value as a character data item when reading in the spec.
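
The order of the two steps described above (sanitize first, substitute second) can be sketched in base R. Both functions below are hypothetical stand-ins written for this illustration: toy_sanitize() escapes only $ and %, and toy_glue() does a plain text replacement. They are not ys_sanitize() or the glue() call that yspec makes.

```r
# Toy sanitizer (hypothetical): escapes only `$` and `%`
toy_sanitize <- function(x) {
  x <- gsub("$", "\\$", x, fixed = TRUE)
  gsub("%", "\\%", x, fixed = TRUE)
}

# Toy glue step (hypothetical): replace each <<name>> with its mapped text
toy_glue <- function(x, map) {
  for (name in names(map)) {
    x <- gsub(paste0("<<", name, ">>"), map[[name]], x, fixed = TRUE)
  }
  x
}

glue_map <- list(mugL = "$\\mu$g/L")

# Raw TeX in the unit field would be escaped and no longer render as math
escaped <- toy_sanitize("$\\mu$g/L")

# The placeholder passes through the sanitizer and is filled in afterwards
protected <- toy_glue(toy_sanitize("<<mugL>>"), glue_map)

cat(escaped, protected, sep = "\n")
```

In this sketch the first line printed carries escaped dollar signs, while the second keeps the TeX intact because the substitution happens after sanitizing.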

flags

The flags section in SETUP__: is available for you to name sets of columns in the spec. For example, the following code defines a flag called covariate and names three columns (WT, AGE, and CRCL) to carry this tag

SETUP__:
  flags:
    covariate: [WT, AGE, CRCL]

When yspec loads a yaml file that contains flags, it will go into every column in the spec and add a logical flag in dots that indicates whether or not that column is a member of that covariate set. For this example, all columns in the spec will have dots$covariate set to FALSE except for WT, AGE, and CRCL where it will be set to TRUE.

The user can appeal to this information when filtering the spec. Filtering like this will return a yspec object containing only WT, AGE, and CRCL.

ys_filter(spec, covariate)

Note that this flagging process will not overwrite a flag that the user already set in a specific column. In this example, AGE will not be flagged as a covariate, but WT and CRCL will.

SETUP__:
  flags:
    covariate: [WT, AGE, CRCL]
WT: 
  short: weight
AGE: 
  short: age
  dots: {covariate: false}
CRCL:
  short: creatinine clearance

It's recommended that flags are given in the SETUP__ information only, but the user can override as needed.
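
The flagging rule, including the "do not overwrite" behavior, can be sketched in base R. This is an illustration only: apply_flags() is a hypothetical function, columns are modeled as plain lists, and a DV column is added so that one column falls outside the flag. It does not reproduce how yspec stores or processes a spec.

```r
# Hypothetical helper: add a logical entry to `dots` for every flag,
# keeping any value the user already set on the column
apply_flags <- function(cols, flags) {
  for (flag in names(flags)) {
    for (nm in names(cols)) {
      dots <- cols[[nm]]$dots
      if (is.null(dots)) dots <- list()
      if (is.null(dots[[flag]])) {
        dots[[flag]] <- nm %in% flags[[flag]]
      }
      cols[[nm]]$dots <- dots
    }
  }
  cols
}

# Columns from the example above, plus DV (not named in the flag)
cols <- list(
  WT   = list(short = "weight"),
  AGE  = list(short = "age", dots = list(covariate = FALSE)),
  CRCL = list(short = "creatinine clearance"),
  DV   = list(short = "dependent variable")
)
flags <- list(covariate = c("WT", "AGE", "CRCL"))

flagged <- apply_flags(cols, flags)

# Columns that end up flagged as covariates
covariates <- names(Filter(function(col) isTRUE(col$dots$covariate), flagged))
covariates
```

Here AGE keeps the FALSE that was set in the column itself, so only WT and CRCL are reported.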



metrumresearchgroup/yspec documentation built on May 24, 2024, 12:48 a.m.