dcast: Convert data between wide and long forms.

View source: R/dcast.R

dcast R Documentation

Description

The dcast formula takes the form LHS ~ RHS, e.g. var1 + var2 ~ var3. The order of entries in the formula matters. There are two special variables, . and ...: . represents no variable, and ... represents all variables not otherwise mentioned in the formula. Values of the LHS variables identify the rows, values of the RHS variables become column names, and the cells hold fun.aggregate(value.var). For details see dcast and melt.
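
As a minimal sketch of how the formula maps onto the result, the snippet below casts a small made-up table. The sales data and its column names are hypothetical and only serve the illustration.

```r
library(maditr)

# Hypothetical toy data, invented for illustration
sales = data.frame(
    region  = c("north", "north", "south", "south"),
    quarter = c("q1", "q2", "q1", "q2"),
    amount  = c(10, 20, 30, 40)
)

# LHS values (region) label the rows; RHS values (quarter) become columns
by_region = dcast(sales, region ~ quarter, value.var = "amount")

# `...` stands for every column not otherwise mentioned
# (here region, since amount is used as the value column)
by_dots = dcast(sales, ... ~ quarter, value.var = "amount")

# `.` stands for no variable, so all observations share a single row;
# each cell then receives several values and needs an aggregating function
overall = dcast(sales, . ~ quarter, value.var = "amount", fun.aggregate = sum)
```

Swapping the two sides of the formula (quarter ~ region) would give one row per quarter and one column per region instead.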

Usage

dcast(
  data,
  formula,
  fun.aggregate = NULL,
  sep = "_",
  ...,
  margins = NULL,
  subset = NULL,
  fill = NULL,
  drop = TRUE,
  value.var = guess(data),
  verbose = getOption("datatable.verbose")
)

melt(
  data,
  id.vars,
  measure.vars,
  variable.name = "variable",
  value.name = "value",
  ...,
  na.rm = FALSE,
  variable.factor = TRUE,
  value.factor = FALSE,
  verbose = getOption("datatable.verbose")
)

guess(data)

Arguments

data

A data.table or data.frame. A data.frame is automatically converted to a data.table.

formula

A formula of the form LHS ~ RHS to cast. For details see dcast.

fun.aggregate

Should the data be aggregated before casting? If the formula doesn't identify a single observation for each cell, then aggregation defaults to length with a message.
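
A small sketch of aggregation during casting, assuming a made-up orders table in which one customer/product pair occurs twice. The data and column names are hypothetical.

```r
library(maditr)

# Hypothetical toy data: customer "a" ordered product "x" twice
orders = data.frame(
    customer = c("a", "a", "a", "b"),
    product  = c("x", "x", "y", "x"),
    qty      = c(1, 2, 5, 4)
)

# Sum the quantities that land in the same customer/product cell
totals = orders %>%
    dcast(customer ~ product, value.var = "qty", fun.aggregate = sum)

# Count the observations per cell, requested explicitly
counts = orders %>%
    dcast(customer ~ product, value.var = "qty", fun.aggregate = length)
```

Passing fun.aggregate explicitly, even when length is what is wanted, makes the intent visible and avoids relying on the default.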

sep

Character vector of length 1, indicating the separating character in variable names generated during casting. Default is _ for backwards compatibility.
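
The separator shows up when a generated column name is built from more than one piece, for example with two RHS variables. A brief sketch with hypothetical data:

```r
library(maditr)

# Hypothetical toy data, invented for illustration
sales = data.frame(
    region  = c("north", "north", "south", "south"),
    year    = c("y2023", "y2024", "y2023", "y2024"),
    quarter = c("q1", "q1", "q1", "q1"),
    amount  = c(10, 20, 30, 40)
)

# Column names combine the year and quarter values, joined by `sep`
wide_default = sales %>%
    dcast(region ~ year + quarter, value.var = "amount")
wide_dot = sales %>%
    dcast(region ~ year + quarter, value.var = "amount", sep = ".")

names(wide_default)  # uses "_" between the pieces
names(wide_dot)      # uses "." between the pieces
```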

...

Any other arguments that may be passed to the aggregating function.

margins

For details see dcast.

subset

Specified if casting should be done on a subset of the data.

fill

Value with which to fill missing cells. If fun.aggregate is present, takes the value by applying the function on a 0-length vector.
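
A short sketch of the effect of fill, using a hypothetical visits table in which user "b" has no record for "tue":

```r
library(maditr)

# Hypothetical toy data: there is no row for user "b" on "tue"
visits = data.frame(
    user = c("a", "a", "b"),
    day  = c("mon", "tue", "mon"),
    n    = c(3, 1, 2)
)

# Without fill, the missing user/day cell is NA
with_na = visits %>% dcast(user ~ day, value.var = "n")

# With fill = 0, the missing cell is set to 0 instead
with_zero = visits %>% dcast(user ~ day, value.var = "n", fill = 0)
```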

drop

FALSE casts with all missing combinations included. c(FALSE, TRUE) includes all missing combinations of the formula LHS only, and c(TRUE, FALSE) includes all missing combinations of the formula RHS only.
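
To illustrate, the sketch below uses a hypothetical table where the site/year pair s2/y2 never occurs. With the default drop = TRUE that pair is left out; with drop = FALSE it appears as an extra row of missing values.

```r
library(maditr)

# Hypothetical toy data: the combination site "s2" / year "y2" is absent
obs = data.frame(
    site = c("s1", "s1", "s2"),
    year = c("y1", "y2", "y1"),
    kind = c("a", "b", "a"),
    n    = c(1, 2, 3)
)

# Default: only the site/year combinations present in the data
present = obs %>% dcast(site + year ~ kind, value.var = "n")

# drop = FALSE: every site/year combination, observed or not
full = obs %>% dcast(site + year ~ kind, value.var = "n", drop = FALSE)

nrow(present)
nrow(full)
```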

value.var

Name of the column whose values will be filled to cast. If none is provided, guess() tries to guess this column automatically. It is possible to cast multiple value.var columns simultaneously. For details see dcast.
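
Casting several value columns at once can be sketched as follows. The trades table is hypothetical, and each resulting column name combines a value.var name with an RHS value, joined by sep.

```r
library(maditr)

# Hypothetical toy data, invented for illustration
trades = data.frame(
    day    = c("d1", "d1", "d2", "d2"),
    ticker = c("X", "Y", "X", "Y"),
    price  = c(10, 20, 11, 21),
    volume = c(100, 200, 110, 210)
)

# One set of ticker columns per value column
wide = trades %>%
    dcast(day ~ ticker, value.var = c("price", "volume"))

names(wide)
```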

verbose

For details see dcast.

id.vars

Vector of id variables. Can be an integer vector (id column numbers) or a character vector (id column names). If missing, all non-measure columns are assigned to it. If integer, must be positive; see the Details of melt.

measure.vars

Measure variables for melting. Can be missing, vector, list, or pattern-based. For details see melt.

variable.name

Name for the measured variable names column. The default name is 'variable'.

value.name

Name for the molten data values column(s). The default name is 'value'. Multiple names can be provided when measure.vars is a list, but note that any names provided in measure.vars take precedence.
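
A sketch of melting into several value columns at once, assuming a hypothetical table with two measurements per year. Each element of the measure.vars list becomes one value column, named by the matching entry of value.name.

```r
library(maditr)

# Hypothetical toy data: two measurements, each recorded in two years
wide = data.frame(
    id          = c(1, 2),
    height_2020 = c(150, 160),
    height_2021 = c(152, 161),
    weight_2020 = c(50, 60),
    weight_2021 = c(52, 61)
)

# One list element per output value column
long = wide %>% melt(
    id.vars       = "id",
    measure.vars  = list(c("height_2020", "height_2021"),
                         c("weight_2020", "weight_2021")),
    variable.name = "wave",
    value.name    = c("height", "weight")
)

names(long)
```

In this form the variable column (here wave) indexes the position within each list element rather than holding the original column names.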

na.rm

If TRUE, NA values will be removed from the molten data.

variable.factor

If TRUE, the variable column will be converted to factor, else it will be a character column.

value.factor

If TRUE, the value column will be converted to factor, else the molten value type is left unchanged.
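
The na.rm and variable.factor switches can be seen side by side in this small sketch. The scores table is hypothetical.

```r
library(maditr)

# Hypothetical toy data with one missing score
scores = data.frame(
    id   = c(1, 2),
    math = c(90, 80),
    art  = c(70, NA)
)

# Defaults: the NA row is kept and the variable column is a factor
long_all = scores %>%
    melt(id.vars = "id", variable.name = "subject", value.name = "score")

# Drop the NA row and keep the variable column as character
long_obs = scores %>%
    melt(id.vars = "id", variable.name = "subject", value.name = "score",
         na.rm = TRUE, variable.factor = FALSE)

nrow(long_all)
nrow(long_obs)
```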

Value

data.table

Author(s)

Matt Dowle mattjdowle@gmail.com

Examples

# load maditr for dcast(), melt() and the %>% pipe
library(maditr)

# examples from 'tidyr' package
stocks = data.frame(
    time = as.Date('2009-01-01') + 0:9,
    X = rnorm(10, 0, 1),
    Y = rnorm(10, 0, 2),
    Z = rnorm(10, 0, 4)
)
stocksm = stocks %>%
    melt(id.vars = "time", variable.name = "stock", value.name = "price")
stocksm %>% dcast(time ~ stock)
stocksm %>% dcast(stock ~ time)

# dcast and melt are complements
df = data.frame(x = c("a", "b"), y = c(3, 4), z = c(5, 6))
df %>%
    dcast(z ~ x, value.var = "y") %>%
    melt(id.vars = "z", variable.name = "x", value.name = "y", na.rm = TRUE)

gdemin/maditr documentation built on April 12, 2024, 10 p.m.