coded.data: Functions for coded data

View source: R/coding.R

coded.dataR Documentation

Functions for coded data

Description

These functions facilitate the use of coded data in response-surface analysis.

Usage

coded.data(data, ..., formulas = list(...), block = "block")
as.coded.data(data, ..., formulas = list(...), block = "block")
decode.data(data)
recode.data(data, ..., formulas = list(...))

val2code(X, codings)
code2val(X, codings)

## S3 method for class 'coded.data'
print(x, ..., decode = TRUE)

### --- Methods for managing coded data ---
is.coded.data(x)

## S3 method for class 'coded.data'
x[...]

codings(object)
## S3 method for class 'coded.data'
codings(object)
codings(object) <- value

## S3 replacement method for class 'coded.data'
names(x) <- value

## Generic method for true variable names (i.e. decoded names)
truenames(x)
## S3 method for class 'coded.data'
truenames(x)
## Generic replacement method for truenames
truenames(x) <- value
## S3 replacement method for class 'coded.data'
truenames(x) <- value

Arguments

data

A data.frame

formulas

List of coding formulas; see details

block

Name(s) of blocking variable(s). It is pmatched (case insensitively) with names in data to identify blocking factorss

X

A vector, matrix, or data.frame to be coded or decoded.

codings

A list of formulas; see Details

decode

Logical. If TRUE, the decoded values are displayed; if FALSE, the codings are displayed.

object

A coded.data object

x

A coded.data object

value

Replacement value for <- methods

...

In coded.data, as.coded.data, and recode.data, ... allows specifying formulas as arguments rather than as a list. In other functions, ... is passed to the parent methods.

Details

Typically, coding formulas are of the form x ~ (var - center) / halfwd where x and var are variable names, and center and halfwd are numbers. The left-hand side gives the name of the coded variable, and the right-hand side should be a linear expression in the uncoded variable (linearity is not explicitly checked, but nonlinear expressions will not decode correctly.) If coded.data is called without formulas, automatic codings are created (along with a warning message). Automatic codings are based on transforming all non-block variables having five or fewer unique values to the interval [-1,1]. If no formulas are provided in as.coded.data, default coding formulas like those for cube are created all numeric variables with mean zero – again with a warning message.

An S3 print method is provided for the coded.data class; it displays the data.frame in either coded or decoded form, along with the coding formulas. Some users may prefer print.data.frame or as.data.frame in lieu of print with ‘⁠decode=FALSE⁠’; they produce the same output without displaying the coding formulas.

Use coded.data to convert a data.frame in which the variables are on their original scales. The variables named in the formulas are coded and replaced with their coded versions (and also renamed).

In contrast, as.coded.data does not modify any of the data; it assumes the variables are already coded, and the coding information is simply added. In addition, if data is already a coded.data object from a pre-1.41 version of rsm, it is converted to be compatible with new capabilities such as djoin (no formulas argument is needed in this case). Any blocking factors should be specified in the blocks argument.

decode.data converts a dataset of class coded.data and returns a data.frame containing the original variables.

recode.data is used to convert a coded.data object to new codings. Important: this changes the coded values to match the new coding formulas. If you want to keep the coded values the same, but change the levels they represent, use ‘⁠codings(object) <- \dots⁠’ or dupe.

code2val converts coded values to the original scale using the codings provided, and returns an object of the same class as X. val2code converts the other direction. When using these functions, it is essential that the names (or column names in the case of matrices) match those of the corresponding coded or uncoded variables.

codings is a generic function for accessing codings. It returns the list of coding formulas from a coded.data object. One may use an expression like ‘⁠codings(object) <- list(\dots)⁠’ to change the codings (without changing the coded values themselves). See also codings.rsm.

is.coded.data(x) returns TRUE if x inherits from coded.data, and FALSE otherwise.

The extraction function x[...] and the naming functions names<-, truenames, and truenames<- are provided to preserve the integrity of codings. For example, if x[, 1:3] excludes any coded columns, their coding formulas are also excluded. If all coded columns are excluded, the return value is unclassed from coded.data. When variable names are changed using names(x) <- ..., the coding formulas are updated accordingly. The truenames function returns the names of the variables in the decoded dataset. We can change the decoded names using truenames(x) <- ..., and the coding formulas are updated. Note that truenames and truenames<- work the same as names and names<- for unencoded variables in the object.

Another convenient way to copy and change the coding formulas a coded dataset (and optionally re-randomize it) is to use the dupe function with a coding argument.

When a design is created in another package, some of the variables may be factors, in which case they are converted using as.numeric (values of 1, 2, ...). These levels may be regarded as a yet different coding of the variables, and so it may take two steps to get it in the desired form: one to convert the supplied levels to the desired range (often -1 to 1), and the other to replace the coding formulas to correspond to the real values of the variables to be used. See the examples.

Value

coded.data, as.coded.data, and recode.data return an object of class coded.data, which inherits from data.frame. A coded.data object is stored in coded form, and its names attribute contains the coded names, where they apply. Thus, when fitting models in rsm or lm with coded data as the data argument, the model formula should be given in terms of the coded variables.

Note

Starting with rsm version 2.00, the coded.data class involves additional attributes to serve broader needs in design-generation. Because of this, old coded.data objects may need to be updated using as.coded.data if they are to be used with the newer functions such as djoin.

Author(s)

Russell V. Lenth

References

Lenth RV (2009). “Response-Surface Methods in R, Using rsm”, Journal of Statistical Software, 32(7), 1–17. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v032.i07")}

See Also

data.frame, djoin, dupe, rsm

Examples

library(rsm)

### Existing dataset with variables on actual scale
CR <- coded.data (ChemReact, x1 ~ (Time - 85)/5, x2 ~ (Temp - 175)/5)
CR                            # same as print(CR, decode = TRUE)
print(CR, decode = FALSE)     # similar to as.data.frame(CR)
code2val (c(x1=.5, x2=-1), codings = codings(CR))

### Existing dataset, already in coded form
CO <- as.coded.data(codata, x1 ~ (Ethanol - 0.2)/0.1, x2 ~ A.F.ratio - 15)
truenames(CO)
names(CO)

# revert x2 to an uncoded variable
codings(CO)[2] <- NULL
truenames(CO)

### Import a design that is coded in a different way

if (require(conf.design)) { # ----- This example requires conf.design -----

# First, generate a 3^3 in blocks and import it via coded.data
    des3 <- coded.data(conf.design(p=3, G=c(1,1,2)))
    # NOTE: This returns a warning message but does the right thing --
    # It generates these names and coding formulas automatically:
    #   x1 ~ (T1 - 2)/1
    #   x2 ~ (T2 - 2)/1
    #   x3 ~ (T3 - 2)/1
# Now randomize and change the codings and variable names for the real situation:
    mydes <- dupe(des3, coding = c(x1 ~ (Dose - 20)/5,  x2 ~ (Conc - 40)/10,  
                                x3 ~ (Time - 60)/15))
                                
} # ----- end of example requiring package conf.design -----


rvlenth/rsm documentation built on Sept. 25, 2023, 6 a.m.