data.list: Create a data list

View source: R/data.list.R


Function for creating a data.list object from vectors, matrices, arrays, data frames, lists, and information on how these objects are related.


data.list(..., dimids, match.dimids, check = TRUE, 
drop = TRUE, unique = TRUE)



...

A comma-separated collection of vectors, matrices, arrays, data frames, and lists, containing the variables that will comprise the resulting data list.


dimids

An optional vector of character strings giving identifiers for the replication dimensions of the data list. If missing, the identifiers are determined from match.dimids when that argument is supplied, and are otherwise set to c("D1","D2",...).


match.dimids

A (possibly named) list of character vectors, each associated with one of the elements of ..., giving identifiers for the replication dimensions of these elements. The only replication dimension of a data frame is along the rows, as columns of data frames are taken to be variables. Data frames are split into their component vectors in the returned data list. If missing, it is determined automatically if possible. See the details below for how automatic dimension matching is done.


check

If TRUE, the structure of the created data list is checked for consistency.


drop

If TRUE, single-dimension data lists are coerced to data frames (i.e. their replication dimensions are 'dropped').


unique

If TRUE, variable names are forced to be unique via make.names.
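
As a rough illustration of what the unique argument implies, the base R function make.names can turn arbitrary labels into syntactically valid, distinct names. This is a sketch of the base function only; the labels are made up, and the exact way data.list calls make.names internally is an assumption.

```r
# Hypothetical variable names, two of which clash.
raw_names <- c("abundance", "abundance", "body size")

# make.names() replaces invalid characters with "." and, with
# unique = TRUE, appends numeric suffixes to duplicated names.
clean_names <- make.names(raw_names, unique = TRUE)
print(clean_names)
```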


This function creates data lists, which are multiple-table extensions of data frames. With the data.frame function, a collection of vectors of identical length is combined into a single object that can be passed to model-fitting and plotting functions. In contrast, the data.list function accepts not only vectors of the same length but also matrices and arrays with possibly different dimensions.

The data.list function creates objects of class data.list, which are collections of variables (i.e. vectors, matrices, and arrays). These variables are related because they share dimensions of replication. For example, a matrix-valued variable might share its first dimension with the only dimension of a vector-valued variable. See vignette("multitable") for more information on the structure of data lists.

The ... argument accepts a collection of vectors, matrices, arrays, data frames, and lists to be converted to a data list. These different types of objects are used by data.list in different ways:


vector

Becomes a variable in the resulting data list with a single dimension of replication. In particular, a vector without a dimension attribute is converted to a one-dimensional array.


matrix

Becomes a variable in the resulting data list with two dimensions of replication.


array

Becomes a variable in the resulting data list with the same number of dimensions as the array itself.


data frame

Each column becomes a variable with a single dimension.


list

Each element becomes a variable. It is required that each element be either a vector, matrix, or array, and that they all have the same value for their dimension attributes.
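
The dimension counts described above can be checked with base R alone. The following is a minimal sketch with made-up objects; it only inspects dim attributes and does not call data.list.

```r
v <- c(2.1, 3.4, 5.6)                 # vector: no dim attribute
m <- matrix(1:6, nrow = 3, ncol = 2)  # matrix: two dimensions
a <- array(1:24, dim = c(3, 2, 4))    # array: three dimensions

# A plain vector has no dim attribute ...
is.null(dim(v))

# ... but it can be viewed as a one-dimensional array.
v1 <- array(v, dim = length(v))
dim(v1)

# Number of replication dimensions each object would contribute.
n_dims <- c(vector = length(dim(v1)),
            matrix = length(dim(m)),
            array  = length(dim(a)))
print(n_dims)
```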

The pattern of dimension sharing between the variables is either determined automatically (if match.dimids is missing) or supplied by passing a list via the match.dimids argument. Automatic dimension matching proceeds in two steps. First, data.list tries to deduce the pattern of dimension matching through the names of the dimensions of the objects passed to .... Different names are used for the different types of objects:


vector

The names attribute.


matrix or array

The dimnames attribute.


data frame

The row.names attribute.


list

Either the names or dimnames attribute, depending on which one its elements possess.

For example, if the names attribute of a vector is identical to the first element in the dimnames attribute of an array, then the single dimension of this vector is matched with the first dimension of this array. Dimension matching by naming is the recommended method, because it requires thought about the relationships between the variables and therefore helps ensure that the structure of the data is well understood.
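
The idea of matching by names can be sketched in base R. The helper match_by_names below is hypothetical and is not part of multitable; it only reports which dimension of an array carries dimnames identical to the names of a vector, which is the comparison described above.

```r
# Hypothetical helper (not part of multitable): for a named vector,
# find which dimension of an array has identical dimnames.
match_by_names <- function(vec, arr) {
  hits <- vapply(dimnames(arr),
                 function(dn) identical(dn, names(vec)),
                 logical(1))
  which(hits)
}

abund <- matrix(runif(6), nrow = 3, ncol = 2,
                dimnames = list(c("s1", "s2", "s3"), c("spA", "spB")))
temp  <- c(s1 = 10.2, s2 = 11.5, s3 = 9.8)
trait <- c(spA = 1.3, spB = 0.7)

match_by_names(temp, abund)   # matches the first dimension
match_by_names(trait, abund)  # matches the second dimension
```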

If dimension matching via the names of the dimensions fails, then data.list tries to infer the pattern of matching by the sizes of the dimensions of its variables. This method will fail if (1) any object has two or more dimensions of the same size AND (2) at least one other object also has at least one other dimension of the same size. In the case of failure, an error message is reported suggesting that either the dimensions of the variables be named or that explicit dimension matching be supplied as a list via the match.dimids argument.
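
The ambiguity that defeats size-based matching can be seen with a small base R sketch. The helper candidates_by_size is hypothetical and is not the algorithm used by data.list; it simply lists the dimensions of an array whose size equals the length of a vector.

```r
# Hypothetical helper (not part of multitable): which dimensions of
# `arr` have the same size as the length of `vec`?
candidates_by_size <- function(vec, arr) which(dim(arr) == length(vec))

b <- runif(10)

# Sizes 10 and 5 are distinct, so only one dimension is a candidate.
ok <- matrix(runif(50), 10, 5)
candidates_by_size(b, ok)

# Both dimensions have size 10, so the match is ambiguous.
square <- matrix(runif(100), 10, 10)
candidates_by_size(b, square)
```

With more than one candidate there is no way to decide from sizes alone, which is why naming the dimensions or supplying match.dimids is suggested in that case.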

Each element of the list passed to match.dimids (i.e. match dimension identifiers) is associated with one of the objects in the collection (e.g. the first element corresponds to the first object in the collection). In particular, the elements in match.dimids specify which dimensions the associated objects are replicated along. Each element in match.dimids should consist of a vector of character strings identifying the dimensions in the corresponding object in .... Dimensions in different objects will be considered shared if they share the same identifier passed to match.dimids. The specification of dimension identifiers depends on the associated type of object passed to ...:


vector

A single string identifying the only dimension.


matrix

A length-2 character vector identifying the first and second dimensions.


array

A length-n character vector identifying the n dimensions.


data frame

A single string identifying the dimension associated with the rows. Each column is given the same dimension identifier.


list

A length-n character vector identifying the n dimensions of the elements of the list. Each element is given the same set of dimension identifiers.
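
As an illustration of these rules, the sketch below builds a match.dimids list for a matrix, a data frame, and a vector. The objects and the identifiers "sites" and "species" are made up for the example, and the final data.list call is shown only as a comment.

```r
# Hypothetical objects: a sites-by-species matrix, a data frame of
# site-level variables, and a vector of species-level values.
abund <- matrix(rpois(12, 3), nrow = 4, ncol = 3)
env   <- data.frame(temp = rnorm(4), depth = runif(4))
trait <- rnorm(3)

# One element per object, in the order they would be passed via `...`.
md <- list(
  c("sites", "species"),  # matrix: one identifier per dimension
  "sites",                # data frame: a single identifier for the rows
  "species"               # vector: a single identifier
)

# Sanity check: identifier counts agree with the rules listed above.
expected <- c(length(dim(abund)), 1L, 1L)
stopifnot(identical(lengths(md), expected))

# The list would then be supplied as
#   data.list(abund, env, trait, match.dimids = md)
```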

To form a valid data list, at least one of the objects in ... must be replicated along all dimensions.

During the production of a data list, one variable is singled out as the 'benchmark' variable. See bm for further details on the benchmark concept. Note that the dimensions of each variable are permuted such that their order matches that of the benchmark.
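
The permutation step can be pictured with aperm from base R. This is a sketch under the assumption that a variable and the benchmark share the same two dimensions in a different order; the identifiers are invented, and the internals of data.list may differ.

```r
# Hypothetical dimension identifiers for the benchmark and one variable.
bm_ids <- c("sites", "species")
x_ids  <- c("species", "sites")

# A 3-by-4 (species-by-sites) variable stored in the other order.
x <- matrix(1:12, nrow = 3, ncol = 4)

# match() gives, for each benchmark dimension, its position in x.
perm <- match(bm_ids, x_ids)

# aperm() reorders the dimensions of x to follow the benchmark.
x_aligned <- aperm(x, perm)
dim(x_aligned)  # 4 3
```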


A data.list object, which is a list with one element for each variable passed via ... (note that each column in a data frame is treated as a separate variable, as is each element in a list). Each variable is given a "subsetdim" attribute, which is a logical vector with one element for each dimension of the benchmark variable. A TRUE element specifies that the variable is replicated along the corresponding dimension, and FALSE indicates otherwise. The data.list object itself also contains the following attributes:


names

Names of the variables


match.dimids

A list of vectors giving the names of the dimensions of replication for each variable (one vector per variable).


bm

The index of the benchmark variable (see bm)


repdim

The replication dimensions (equal to the dimension attribute of the benchmark variable)
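
The "subsetdim" attribute described above can be mimicked in base R. The sketch below assumes made-up dimension identifiers and derives, for each variable, one logical per benchmark dimension; it shows the idea only and does not reproduce the package's code.

```r
# Hypothetical benchmark dimension identifiers.
bm_ids <- c("sites", "species")

# Hypothetical dimension identifiers of three variables.
var_ids <- list(abundance   = c("sites", "species"),
                temperature = "sites",
                trait       = "species")

# One logical per benchmark dimension: TRUE where the variable is
# replicated along that dimension, FALSE otherwise.
subsetdim <- lapply(var_ids, function(ids) bm_ids %in% ids)
str(subsetdim)
```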


The data.list function is largely a wrapper that lets objects be combined into a data list via a ... argument, as is done in data.frame.

See also the methods for coercing a data list to a data frame and for subscripting the multiple tables in a data list simultaneously, as well as nvar, varnames, and the other methods for data.list objects. If your data are originally in (database-like) ‘long’ format data frames, use dlcast to create data lists. If your data are originally in text files, use read.multitable.


## Automatic dimension matching by the sizes of dimensions.
## Note that this example would not work if all of the 10's were
## changed to 5's.  This example also illustrates how to pass
## several variables through one list, as long as each variable
## shares the same dimensions (in this case 10-by-5 matrices).
## We also see here how data.list automatically converts
## character vectors to factors.
a1 <- matrix(runif(50), 10, 5)
a2 <- matrix(runif(50), 10, 5)
a3 <- matrix(runif(50), 10, 5)
a <- list(a1, a2, a3)
b <- runif(10)
c <- letters[1:5]
data.list(a, b, c)

## Here we illustrate the use of dimension matching by
## dimension naming.
a <- lapply(a, `dimnames<-`, list(letters[1:10], LETTERS[1:5]))
names(b) <- letters[1:10]
names(c) <- LETTERS[1:5]
data.list(a, b, c)

## If we want to name the dimension identifiers themselves
## we can use dimids.
data.list(a, b, c, dimids = c("small letters", "large letters"))

## Or we could explicitly specify the pattern of dimension
## sharing using match.dimids.
md <- list(
	c("small letters", "large letters"),
	"small letters",
	"large letters")
data.list(a, b, c, match.dimids = md)

stevencarlislewalker/multitable documentation built on May 26, 2017, 7:59 p.m.