Data shaping tools


Generate a new dataset in a format that is accepted by compare. Dummy variables are created for categorical variables.


datashape(dataset, y, x, subs)



dataset: a data frame or matrix containing user data.


y: the column number of the outcome variable in dataset.


x: a vector identifying the explanatory or predictor variables of interest. The new data matrix will contain only these variables and the outcome variable.


subs: a vector describing a subset of rows from dataset to include in the returned data matrix.
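
As a rough mental model of the y, x and subs arguments, the selection can be pictured with ordinary base R indexing. This is only a sketch, assuming that datashape keeps the outcome and the chosen predictors for the chosen rows. The column order and the matrix conversion that datashape itself performs are not reproduced here, and the object name is hypothetical.

```r
# Sketch only: base R indexing that mimics the row/column selection
# described for y, x and subs. It is not datashape's implementation.
data(iris)

y    <- 4                  # outcome column (Petal.Width)
x    <- c(1, 2)            # predictor columns (Sepal.Length, Sepal.Width)
subs <- c(1:20, 50:70)     # rows to keep

iris.selected <- iris[subs, c(y, x)]   # hypothetical object name
dim(iris.selected)
```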


This function prepares a dataset before the compare function is applied. The outcome column number must be specified; specific predictors and observation subsets are optional. Two-level categorical variables are converted to binary variables, while dummy variables are created for categorical predictors with more than two levels.
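
To make the dummy-variable idea concrete, the following base R sketch expands a small made-up data frame with model.matrix(). It illustrates the general technique (treatment-contrast coding, which produces one column for a two-level factor and k - 1 columns for a k-level factor) and is not taken from datashape. Whether datashape uses the same coding, baseline level, or column names is an assumption not confirmed by this page.

```r
# Illustration only: dummy coding with base R, not datashape's own code.
# The data frame and its column names are invented for this example.
df <- data.frame(
  outcome = c(1.2, 3.4, 2.2, 5.1, 4.0, 2.9),
  group   = factor(c("a", "b", "a", "b", "a", "b")),                   # two levels
  colour  = factor(c("red", "green", "blue", "red", "green", "blue"))  # three levels
)

# model.matrix() expands every factor into indicator columns;
# dropping the first column removes the intercept.
shaped <- model.matrix(~ ., data = df)[, -1]

colnames(shaped)
head(shaped)
```

Here the two-level factor group becomes a single 0/1 column, and the three-level factor colour becomes two indicator columns measured against its first level.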

The "datashaped" dataset should be saved to a new object.


This function returns a matrix conforming to the specifications supplied to the datashape function.


datashape will not function if missing values are present.
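
Because of this restriction, it can help to check for missing values first. The sketch below uses only base R and the built-in airquality dataset. Dropping incomplete rows is just one possible way to handle missing data and may not suit every analysis. The final datashape call is left commented out because it needs the package that provides datashape to be loaded.

```r
# Check for missing values before shaping (base R only).
data(airquality)                 # built-in dataset that contains NA values

anyNA(airquality)                # are any values missing?
colSums(is.na(airquality))       # missing values per column

# One option: keep only rows with no missing values.
aq.complete <- airquality[complete.cases(airquality), ]
anyNA(aq.complete)

# With the package providing datashape loaded, the cleaned data
# could then be shaped, for example:
# aq.shaped <- datashape(dataset = aq.complete, y = 1)
```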

Examples

## Preparing the iris dataset
data(iris)
iris.shaped <- datashape(dataset = iris, y = 4)
head(iris.shaped)

## Creating a copy of iris with sepal-related predictors and a subset of observations
iris.sub <- datashape(dataset = iris, y = 4, x = c(1, 2), subs = c(1:20, 50:70))
head(iris.sub)
