coarsened: Coarsened Factors


coarsened R Documentation

Coarsened Factors

Description

A coarsened factor is an extended version of a factor or ordered factor whose elements may be fully observed, partially observed, or missing. The partially observed and missing states are represented by extra levels, which are interpreted as groupings of the fully observed states. Coarsened factors are specifically designed for modeling with the cvam package.

Usage

coarsened(obj, levelsList = list(), warnIfCoarsened = TRUE)

is.coarsened(x)

## S3 method for class 'coarsened'
print(x, quote = FALSE, max.levels = NULL,
   width = getOption("width"), ...)

## S3 method for class 'coarsened'
droplevels(x, ...)

## S3 method for class 'coarsened'
relevel(x, ...)

## S3 method for class 'coarsened'
reorder(x, ...)

## S3 method for class 'coarsened'
rep(x, ...)

## S3 method for class 'coarsened'
x[...]

## S3 method for class 'coarsened'
x[[...]]

## S3 replacement method for class 'coarsened'
x[...] <- value

## S3 replacement method for class 'coarsened'
x[[...]] <- value

Arguments

obj

a factor or ordered factor to be converted to a coarsened factor

levelsList

a named list that defines the groupings of levels(obj) to indicate states of partial knowledge

warnIfCoarsened

if TRUE, a warning is issued if obj is already a coarsened factor

x

a coarsened factor or other object

quote

logical, indicating whether or not strings should be printed with surrounding quotes

max.levels

integer, indicating how many base levels and coarse levels should be printed for a coarsened factor; if 0, no extra base levels or coarse levels lines will be printed. The default, NULL, entails choosing max.levels such that the base levels and coarse levels each print on one line of width width

width

only used when max.levels is NULL; see above

...

additional arguments passed to or from other methods

value

character: a set of levels for replacement

Details

A coarsened factor, which inherits from class "factor" or c("ordered", "factor"), has two types of levels: base levels, which represent states of complete knowledge, and coarse levels, which represent states of incomplete knowledge. Each coarse level maps to two or more base levels. The mapping is defined by the argument levelsList.

For example, consider a factor whose levels are c("red", "notRed", "green", "yellow"), where "notRed" denotes an observation that is either "green" or "yellow". When the factor is converted to a coarsened factor, c("red", "green", "yellow") becomes the baseLevels, and "notRed" becomes an element of coarseLevels. To produce this result, the argument levelsList should have a component named "notRed", whose value is c("green", "yellow").
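The color example above can be sketched as follows. This is a minimal illustration that assumes the cvam package is installed; the object names colors and cColors are made up for this sketch, and baseLevels (listed under See Also) is assumed to return the base levels as a character vector.

```r
library(cvam)

# An ordinary factor in which "notRed" stands for "green or yellow"
colors <- factor(
  c("red", "notRed", "green", "yellow", "red"),
  levels = c("red", "notRed", "green", "yellow")
)

# Declare "notRed" as a coarse level that maps to two base levels
cColors <- coarsened(
  colors,
  levelsList = list(notRed = c("green", "yellow"))
)

# Inspect the base levels (assumed to be a character vector)
baseLevels(cColors)
```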

The last coarse level is NA, denoting an observation that could belong to any of the base levels. The NA coarse level is created automatically. Calling coarsened with an empty levelsList (the default) produces a coarsened factor with NA as its only coarse level.
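As a small sketch of the default case, assuming cvam is installed, calling coarsened without a levelsList treats every existing level as a base level and adds only the NA coarse level. The names fac0 and cFac0 are illustrative.

```r
library(cvam)

# A factor with one missing value and no partially observed levels
fac0 <- factor(c("a", "b", NA, "a"))

# Default levelsList = list(): NA becomes the only coarse level
cFac0 <- coarsened(fac0)

print(cFac0)
```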

If the main argument to coarsened is already a coarsened factor, then a warning is issued (if warnIfCoarsened is TRUE) and the coarsened factor is returned unchanged.
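The following sketch, which assumes cvam is installed, shows one way to observe this behavior: the warning is captured with tryCatch, and a second call suppresses it through warnIfCoarsened = FALSE. The variable names are illustrative.

```r
library(cvam)

fac <- factor(c("red", "green", NA, "yellow", "notRed", "green"))
cFac <- coarsened(fac, levelsList = list(notRed = c("green", "yellow")))

# Capture the warning message instead of letting it print;
# msg stays NA if no warning is raised
msg <- tryCatch(
  {
    coarsened(cFac)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

# Same call with the warning switched off; the input comes back unchanged
same <- coarsened(cFac, warnIfCoarsened = FALSE)
identical(same, cFac)
```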

The generic functions droplevels, relevel, and reorder should not be applied to coarsened factors; the S3 methods droplevels.coarsened, relevel.coarsened, and reorder.coarsened prevent this by stopping with an error.
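A hedged sketch of what this looks like in practice, assuming cvam is installed: each call is wrapped in try so that the script continues and the failures can be inspected. The arguments passed to relevel and reorder are arbitrary choices made for this illustration.

```r
library(cvam)

fac <- factor(c("red", "green", NA, "yellow", "notRed", "green"))
cFac <- coarsened(fac, levelsList = list(notRed = c("green", "yellow")))

# Level-manipulating calls that the coarsened methods are meant to block
attempts <- list(
  droplevels = function() droplevels(cFac),
  relevel    = function() relevel(cFac, ref = "green"),
  reorder    = function() reorder(cFac, seq_along(cFac))
)

# TRUE for every call that stopped with an error
failed <- vapply(
  attempts,
  function(f) inherits(try(f(), silent = TRUE), "try-error"),
  logical(1)
)
failed
```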

rep.coarsened is a method for the generic function rep that ensures the special attributes of a coarsened factor are preserved.

Extraction and replacement methods `[` and `[[` are also provided to preserve the special attributes of coarsened factors.
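To illustrate that rep and `[` keep the result a coarsened factor, here is a minimal sketch assuming cvam is installed. The names repeated and subset23 are illustrative.

```r
library(cvam)

fac <- factor(c("red", "green", NA, "yellow", "notRed", "green"))
cFac <- coarsened(fac, levelsList = list(notRed = c("green", "yellow")))

# Repeat the whole vector twice; dispatches to rep.coarsened
repeated <- rep(cFac, 2)

# Extract elements 2 and 3; dispatches to the `[` method for coarsened
subset23 <- cFac[2:3]

# Both results should still be coarsened factors
is.coarsened(repeated)
is.coarsened(subset23)
```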

Value

coarsened returns a coarsened factor.

is.coarsened returns TRUE if x is a coarsened factor and FALSE otherwise.
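A short sketch of both return values, assuming cvam is installed. It also checks that the result still inherits from "factor", as stated in Details.

```r
library(cvam)

fac <- factor(c("red", "green", NA, "yellow"))
cFac <- coarsened(fac)

is.coarsened(cFac)        # a coarsened factor
is.coarsened(fac)         # an ordinary factor
is.coarsened(1:3)         # some other object
inherits(cFac, "factor")  # coarsened factors are still factors
```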

Note

Coarsened factors were designed for use by the modeling function cvam, which treats base levels and coarse levels differently. Other statistical modeling routines, such as lm, may not handle them appropriately. Functions outside of the cvam package will treat coarse levels (including NA) the same as base levels, producing results that are difficult to interpret or nonsensical, especially if the base levels are ordered.

The behavior of coarsened with levelsList = list() is similar to that of addNA, which converts the missing values in a factor to non-missing observations with value NA and adds NA to the levels. The result of addNA, however, is an ordinary factor or ordered factor which has no mechanism to inform other functions that NA has special meaning.
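The addNA side of this comparison can be checked with base R alone. The sketch below shows that the result is an ordinary factor with NA among its levels, and that nothing in the object marks NA as special.

```r
# Base R only: no cvam functions are used here
fac <- factor(c("red", "green", NA, "yellow"))
facNA <- addNA(fac)

# NA is now one of the levels
levels(facNA)
anyNA(levels(facNA))

# The former missing value is stored as a regular level code,
# so is.na() no longer flags it
is.na(facNA)

# The object carries no "coarsened" class
inherits(facNA, "coarsened")
```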

The function is.na should not be applied to a coarsened factor; use is.naCoarsened instead.
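A minimal usage sketch, assuming cvam is installed. The return value of is.naCoarsened is assumed here to be a logical vector that flags the elements at the NA coarse level; consult its own help page for the exact behavior.

```r
library(cvam)

fac <- factor(c("red", "green", NA, "yellow", "notRed", "green"))
cFac <- coarsened(fac, levelsList = list(notRed = c("green", "yellow")))

# Flag elements at the NA coarse level (assumed to be a logical vector)
missingFlag <- is.naCoarsened(cFac)
missingFlag
```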

Because base levels and coarse levels should be handled differently, functions from base R that manipulate the levels of a factor, including relevel, reorder, droplevels, and the replacement version of levels, should not be used with coarsened factors. Supplying a coarsened factor to any of these functions will produce an error.

Author(s)

Joe Schafer <Joseph.L.Schafer@census.gov>

References

For more information, refer to the package vignette Understanding Coarsened Factors in cvam.

See Also

cvam, is.naCoarsened, baseLevels, dropCoarseLevels

Examples

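# a factor with a partially observed level ("notRed") and a missing value
fac <- factor(c("red", "green", NA, "yellow", "notRed", "green"))
# declare "notRed" as a coarse level covering "green" and "yellow"
cFac <- coarsened(fac,
   levelsList = list("notRed" = c("green", "yellow")))
print(cFac)
# extraction and replacement
print(cFac[2:3])
cFac[2:3] <- c("NA", "green")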

cvam documentation built on March 7, 2023, 5:29 p.m.