yspec uses standard yaml syntax to state the data set column definitions.
NOTES

- true and yes by themselves will be rendered as TRUE; use "yes" if you need that word by itself as a value for a field
- false and no by themselves will be returned as FALSE; use "no" if you need that word by itself as a value for a field
- short: "> QL" not short: > QL
- values: [".", C] not values: [.,C]
- address: "123 Main St." not address: 123 Main St.
- label: "line 1 \\n line 2" not label: line 1 \n line2
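The quoting notes above can be sketched in a few column definitions. The column names FLAG, ANSWER, and NOTE are hypothetical and exist only to show where quotes matter; this is an illustration under those assumptions, not a recommended layout.

```yaml
# Hypothetical columns; only the quoting is the point here.
FLAG:
  short: "> QL"               # unquoted, > would begin a yaml folded block
  values: [".", C]            # quote the lone dot so it is read as text
ANSWER:
  short: survey answer
  type: character
  values: ["yes", "no"]       # unquoted, these would become TRUE and FALSE
NOTE:
  type: character
  label: "line 1 \\n line 2"  # double the backslash inside double quotes
```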
Instructions for including TeX in the yaml specification code are provided in a section below.
Save your data specification code in a file, typically with a .yaml file extension.

At the top of the file, include a block called SETUP__; this is where the data set metadata is stored. For example:

    SETUP__:
      description: PKPD analysis data set
      use_internal_db: true
      projectnumber: FOO123
      sponsor: MetrumRG
See the details below for other files that can be included here.
Next, list each data set column in order, with the data column name starting in the first column and ending with a colon. For example:
    WT:
      short: weight
      unit: kg
      range: [50, 150]
This specifies a "short" name for this column as well as a unit and a range. A complete listing is provided below.
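To show how the pieces fit together in one file, here is a minimal sketch of a complete specification: the SETUP__ block first, then the columns in data set order. The ID and SEX columns and the description text are made up for illustration; only fields described in this topic are used.

```yaml
# Minimal sketch of a whole spec file (column names are hypothetical).
SETUP__:
  description: Example analysis data set
  projectnumber: FOO123
  sponsor: MetrumRG
ID:
  short: subject identifier
  type: integer
WT:
  short: weight
  unit: kg
  range: [50, 150]
SEX:
  values: {male: 0, female: 1}
```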
You can see a fully worked example by running

    ys_help$yaml()

Or, you can export a collection of package assets with this command

    ys_help$export(output="assets")

See the ?ys_help help topic for more information.
SETUP__ specification fields

- description: <chr>; a short, label-like description of the data set
- projectnumber: <chr>; the project reference number; may be incorporated into rendered define documents; when the project number is given in the first yspec object in a project object, that project number will be rendered in the project-wide define document
- sponsor: <chr>; the project sponsor; when the project sponsor is given in the first yspec object in a project object, that project sponsor name will be rendered in the project-wide define document
- data_path: <chr>; a path locating the data set associated with the spec
- data_stem: <chr>; the stem (no extension) for the data set associated with the spec; usually the stem of the data file is the same as the stem of the spec, but they can also be different
- lookup_file: <chr>; a yaml array of other yaml files where yspec will look for column lookup information
- use_internal_db: <logical> (true/false); if true, then yspec will load the internal column lookup database
- import: <chr>; give the name of a <file> to import into the current data spec; all columns from <file> are imported as is; additional columns may also be listed with the normal syntax and these columns will appear after the imported columns
- character_last: <logical>; if true, automatically push all non-numeric columns to the end of the data specification list
- comment_col: <chr>; identify the column that is used to store comments; the comment column will not be pushed to the back when character_last is true
- glue: <map>; specify name/value pairs; in the yaml data specification, use <<name>> in the text and value will be glued into the text after it has been sanitized; intended use is to allow LaTeX code to evade the sanitizer
- max_nchar_label: integer; the maximum number of characters allowed in the label field
- max_nchar_short: integer; the maximum number of characters allowed in the short field
- max_nchar_col: integer; the maximum number of characters allowed in the data set column name
- flags: <map>; for each key, an array of column names where a logical data item will be set in the dots list

Data column specification fields

- short: short-name
- unit: numeric
- range: [min-value, max-value]
- values: [val1, val2, valn]
- values: {decode1: val1, decode2: val2} (decode to the left of the : and value to the right)
- decode: [decode1, decode2, decode3]; gives the decode separately from the values specification
- longvalues: true; prints the values in a yaml-formatted list
- comment: just whatever you want to say
- comment, on multiple lines:

      comment: >
        say something
        on multiple lines of
        text

- source: ADSL.xpt
- about: [short-name, unit]
- label: a label for the column; the label must be 40 or fewer characters and will get written into the define file as well as the data frame prior to writing out to SAS xport format
- long: a longer name to describe the column
- dots: the dots list isn't used by any rendering function in the yspec package, but might be used by a custom rendering function
- axis: if short will work for your axis title (as it is ... with no modification), yspec will use that if no axis field is used
- type: numeric, character, or integer; the default is numeric
- make_factor: if true, then the column will be able to be converted to a factor regardless of whether decode is included or not
- lookup: if true, then the definition for the column is looked up in the lookup files (specified in SETUP__); use the !look handler to indicate lookup

Namespaces are alternative representations of certain column data fields. The eligible fields are:

- unit
- short
- label
- long
- decode
- comment
You can create namespaces by attaching a .<name> suffix to eligible fields. For example, we can create a "tex" representation for unit like this

    DV:
      short: dependent variable
      unit: "microgram/mL"
      unit.tex: "$\\mu$g/mL"

Here, the unit: entry states the value for unit in the base namespace, the default data you get on load. Using unit.tex: introduces an entry for the tex namespace. After loading the spec, you can change to this namespace using

    spec <- ys_load(...)
    spec_tex <- ys_namespace(spec, "tex")
Any time you attach a .<name> suffix to a field, yspec will interpret that as an attempt to enter namespace data. The user is responsible for creating, organizing, and naming namespaces. yspec will create the base namespace. Also, when rendering a data specification document, yspec will attempt to switch to the tex namespace if it exists. Beyond that, yspec is agnostic to the names of the namespaces you create.
As another example, we can have alternate short names depending on whether or not we are using that name to create axis titles for a plot

    EGFR:
      short: estimated creatinine clearance
      short.plot: eGFR

or decode

    SEX:
      values: [0, 1]
      decode: [male, female]
      decode.letter: [m, f]
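Several namespaces can sit side by side on one column. The sketch below is a hypothetical AUC column carrying a base entry plus tex and plot variants; the column and the plot namespace name are assumptions chosen for illustration, since yspec only treats base and tex specially.

```yaml
# Hypothetical column with base, tex, and plot namespace entries.
AUC:
  short: area under the curve
  short.plot: AUC            # used only after switching to "plot"
  unit: "ng*hr/mL"           # base namespace value
  unit.tex: "ng$\\cdot$hr/mL" # tex namespace value; backslash is doubled
```

After loading, a call such as ys_namespace(spec, "plot") would be expected to swap in the short.plot value wherever one was given.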
- If type is not given, then it will default to numeric
- The about array provides a short name and unit
- If range is given, the data is assumed to be continuous

    WT:
      about: [weight, kg]
      range: [5, 300]

This is equivalent to

    WT:
      short: weight
      unit: kg
      range: [5, 300]
Using values indicates discrete data

    RACE:
      values: [White, Black, Native American, Other]

Any other array input structure can be used. For example

    RACE:
      values:
        - White
        - Black
        - Native American
        - Other

By default, values are printed as a comma-separated list. To get them to print in long format

    RACE:
      values: [White, Black, Native American, Other]
      longvalues: true
Method 1: give values as a map, with a : that separates the decode (on the left) and the value (on the right).

    SEX:
      values: {dude: 0, gal: 1}
Special handlers are available that add some flexibility to this value / decode specification.
The !value:decode handler allows you to put the value on the left and decode on the right

    SEX:
      values: !value:decode
        0 : dude
        1 : gal

The default behavior can be achieved with the !decode:value handler

    SEX:
      values: !decode:value
        dude: 0
        gal: 1

The handlers also allow associating multiple values with a single decode

    STUDY:
      values: !decode:value
        phase 1 : [101, 102, 103]
        phase 2 : [201, 202, 203]
        phase 3 : [301, 302, 303]
Method 2: give values and decode separately, each in brackets (array)

    BQL:
      values: [0, 1]
      decode: [not below quantitation limit, below quantitation limit]

Method 3: really, it's the same as method 2, but easier to type and read when the decode gets really long

    BQL:
      values: [0, 1]
      decode:
        - not below the quantitation limit of 2 ng/ml
        - below the quantitation limit of 2 ng/ml
Either fill in the lookup field or use the !look handler

    CMT:
      lookup: true

    CMT: !look

You can also give the column name to import

    HT:
      lookup: HT_INCHES

In this example, there would be a column called HT_INCHES in the lookup file that would be imported under the name HT.
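The lookup forms above can be sketched end to end with a spec and the file it draws from. Everything named here is an assumption for illustration: the file name lookup.yaml, the way its location is resolved, and the premise that a lookup file holds ordinary column definitions. The two yaml documents below stand for two separate files.

```yaml
# Spec file (hypothetical); lookup.yaml is an assumed file name.
SETUP__:
  description: Example data set
  lookup_file: [lookup.yaml]
CMT: !look            # take the CMT definition from the lookup file
WT:
  lookup: true        # same request, written as a field
HT:
  lookup: HT_INCHES   # import HT_INCHES under the name HT
---
# lookup.yaml (hypothetical); assumed to hold regular column definitions.
CMT:
  short: compartment number
  type: integer
WT:
  short: weight
  unit: kg
HT_INCHES:
  short: height
  unit: inches
```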
Most define documents get rendered via xtable and the text gets processed by a sanitize function. yspec implements a custom sanitize function called ys_sanitize(), which is similar to xtable::sanitize, but whitelists some symbols so they do not get sanitized.
To protect TeX code from the sanitizer, first create a field in SETUP__ called glue with a map between a name and some corresponding TeX code. In the following example, we wish to write $\mu$g/L, so we create a name called mugL and map it to $\\mu$g/L:

    SETUP__:
      glue: {mugL: "$\\mu$g/L"}

Once the map is in place, we can write the data set column definition like this:

    DV:
      unit: "<<mugL>>"
When the table for the define document is rendered, first the sanitizer will run, but it won't find anything to sanitize in the unit field for the DV column. Then yspec will call glue() and replace <<mugL>> with $\\mu$g/L.
Notice that we put all of the values in quotes; this is good practice to ensure that yaml will parse the value as a character data item when reading in the spec.
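A glue map can hold more than one entry. The sketch below adds a second, hypothetical name (degC) and a hypothetical TEMP column alongside the DV example; the TeX strings are illustrative and every value is quoted so that yaml reads it as character data.

```yaml
# Sketch: two glue entries and the columns that reference them.
SETUP__:
  glue:
    mugL: "$\\mu$g/L"
    degC: "$^{\\circ}$C"   # hypothetical second entry
DV:
  short: concentration
  unit: "<<mugL>>"         # replaced after sanitizing
TEMP:
  short: body temperature
  unit: "<<degC>>"
```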
flags
The flags section in SETUP__ is available for you to name sets of columns in the spec. For example, the following code defines a flag called covariate and it names three columns (WT, AGE, and CRCL) to carry this tag

    SETUP__:
      flags:
        covariate: [WT, AGE, CRCL]
When yspec loads a yaml file that contains flags, it will go into every column in the spec and add a logical flag in dots that indicates whether or not that column is a member of that covariate set. For this example, all columns in the spec will have dots$covariate set to FALSE except for WT, AGE, and CRCL, where it will be set to TRUE.
The user can appeal to this information when filtering the spec. Filtering like this will return a yspec object containing only WT, AGE, and CRCL.

    ys_filter(spec, covariate)
Note that this flagging process will not overwrite a flag that the user already set in a specific column. In this example, AGE will not be flagged as a covariate, but WT and CRCL will.

    SETUP__:
      flags:
        covariate: [WT, AGE, CRCL]
    WT:
      short: weight
    AGE:
      short: age
      dots: {covariate: false}
    CRCL:
      short: creatinine clearance
It's recommended that flags are given in the SETUP__ information only, but the user can override as needed.
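Because flags is a map with one array per key, more than one set of columns can be named at once. In this sketch the second flag (times) and the TIME and TAD columns are hypothetical additions to the covariate example.

```yaml
# Sketch: two flags defined together in SETUP__.
SETUP__:
  flags:
    covariate: [WT, AGE, CRCL]
    times: [TIME, TAD]       # hypothetical second flag
TIME:
  short: time after first dose
  unit: hours
TAD:
  short: time after dose
  unit: hours
WT:
  short: weight
AGE:
  short: age
CRCL:
  short: creatinine clearance
```

By analogy with the covariate example, ys_filter(spec, times) would then be expected to keep only TIME and TAD.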