The DataFrameFactor class is a subclass of the Factor class where the levels are the rows of a DataFrame. It provides a few methods to mimic the behavior of an actual DataFrame while retaining the memory efficiency of the Factor structure.


DataFrameFactor(x, levels, index=NULL, ...)  # constructor function


x, levels

DataFrame objects. At least one of x and levels must be specified. If index is NULL, both can be specified.

When levels is specified, it must be a DataFrame with no duplicate rows (i.e. anyDuplicated(levels) must return 0).

See ?Factor for more details.


index

NULL or an integer (or numeric) vector of valid positive indices (no NAs) into levels. See ?Factor for details.


...

Optional metadata columns.


A DataFrameFactor object.
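
The argument combinations described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it assumes that DataFrame and DataFrameFactor are provided by the Bioconductor S4Vectors package (the text above does not name the package), and the objects lev, rows, f1, f2 and f3 are hypothetical names chosen for the example.

```r
library(S4Vectors)  # assumption: package providing DataFrame and DataFrameFactor

# A small set of levels: a DataFrame with no duplicate rows.
lev <- DataFrame(X=1:3, Y=c("A", "B", "C"))

# levels and index: each index entry points to a row of 'lev'.
f1 <- DataFrameFactor(levels=lev, index=c(3L, 1L, 1L, 2L))

# x only: the levels are derived from the rows of 'x'.
rows <- lev[c(3, 1, 1, 2), ]
f2 <- DataFrameFactor(rows)

# x and levels, with index left as NULL: the rows of 'x' are
# matched against the rows of 'levels'.
f3 <- DataFrameFactor(rows, levels=lev)

length(f1)      # number of entries, not number of levels
as.integer(f1)  # the index into the levels
```

Supplying levels and index avoids building the expanded DataFrame in the first place, which is where the memory saving of the Factor representation comes from.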


DataFrameFactor objects support the same set of accessors as Factor objects. In addition, they mimic some aspects of the DataFrame interface. The general principle is that, for these methods, a DataFrameFactor x behaves like the expanded DataFrame unfactor(x).

  • x$name will return the column called name from levels(x), expanded according to the indices in x.

  • x[i, j, ..., drop=TRUE] will return a new DataFrameFactor subsetted to entries i, where the levels are subsetted by column to contain only columns j. If the resulting levels only have one column and drop=TRUE, the expanded values of the column are returned directly.

  • dim(x) will return the length of the DataFrameFactor and the number of columns in its levels.

  • dimnames(x) will return the names of the DataFrameFactor and the column names in its levels.
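
The accessors listed above can be tried on a small object, as in the sketch below. It assumes that the class is provided by the Bioconductor S4Vectors package; lev and x are hypothetical names used only for illustration.

```r
library(S4Vectors)  # assumption: package providing DataFrame and DataFrameFactor

lev <- DataFrame(X=1:3, Y=c("A", "B", "C"))
x <- DataFrameFactor(levels=lev, index=c(3L, 1L, 1L, 2L))

# Column 'Y' of the levels, expanded by the index of 'x'.
x$Y

# Same values as taking the column from the expanded DataFrame.
identical(x$Y, unfactor(x)$Y)

# First two entries, all columns: still a DataFrameFactor.
x[1:2, ]

# A single column with the default drop=TRUE: the expanded values.
x[, "X"]

# A single column with drop=FALSE: a DataFrameFactor with one-column levels.
x[, "X", drop=FALSE]

# Length of 'x' and number of columns in its levels.
dim(x)

# Names of 'x' and column names of its levels.
dimnames(x)
```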


The DataFrame-like methods implemented here are for convenience only. Users should not assume that the DataFrameFactor complies with other aspects of the DataFrame interface, due to fundamental differences between a DataFrame and the Factor parent class, e.g., in the interpretation of their “length”. Outside of the methods listed above, the DataFrameFactor is not guaranteed to work as a drop-in replacement for a DataFrame; use unfactor(x) to obtain an actual DataFrame instead.
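
The difference in the meaning of “length” can be seen in a short sketch. It assumes that the class is provided by the Bioconductor S4Vectors package; lev, x and expanded are hypothetical names.

```r
library(S4Vectors)  # assumption: package providing DataFrame and DataFrameFactor

lev <- DataFrame(X=1:3, Y=c("A", "B", "C"))
x <- DataFrameFactor(levels=lev, index=c(3L, 1L, 1L, 2L))

# Expand to an ordinary DataFrame with one row per entry of 'x'.
expanded <- unfactor(x)

length(x)         # number of entries in the factor
length(expanded)  # number of columns, as for any DataFrame
nrow(expanded)    # number of rows, equal to length(x)
```

Code that relies on length(), or on any DataFrame method not listed above, is therefore safer when it operates on unfactor(x).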


Aaron Lun

Factor objects for the parent class.


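library(S4Vectors)  # assumption: the package that provides DataFrame and DataFrameFactor

df <- DataFrame(X=sample(5, 100, replace=TRUE),
                Y=sample(c("A", "B"), 100, replace=TRUE))
dffac <- DataFrameFactor(df)

# Select (and reorder) the columns of the levels:
dffac[,c("Y", "X")]

# The usual Factor methods may also be used:
unfactor(dffac)
levels(dffac)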

