melt is data.table's wide-to-long reshaping tool.
We provide an S3 method for melting data.tables. It is written in C for speed and memory efficiency.
melt.data.table allows melting into multiple columns simultaneously.
## fast melt a data.table
## S3 method for class 'data.table'
melt(data, id.vars, measure.vars,
    variable.name = "variable", value.name = "value",
    ..., na.rm = FALSE, variable.factor = TRUE,
    value.factor = FALSE,
    verbose = getOption("datatable.verbose"))
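id.vars: vector of id variables. Can be an integer (corresponding id column numbers) or character (id column names) vector. If missing, all non-measure columns will be assigned to it. If integer, must be positive; see Details.
measure.vars: Measure variables for melting. Can be an integer (column numbers) or character (column names) vector, a list of such vectors, or the result of patterns(). If missing, all columns not in id.vars are used.
For convenience/clarity in the case of multiple melted columns, the resulting column names can be supplied as names of the elements of measure.vars (in the list and patterns usages).
value.name: name for the molten data values column(s). The default name is 'value'.
...: any other arguments to be passed to/from other methods.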
If id.vars and measure.vars are both missing, all
non-numeric/integer/logical columns are assigned as id variables and
the rest as measure variables. If only one of id.vars or
measure.vars is supplied, the rest of the columns will be assigned to
the other. Both id.vars and measure.vars can contain the same column
more than once, and the same column can be used as both an id and a measure variable.
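The defaulting rules for id.vars and measure.vars can be sketched on a small table. This is an illustrative example only: the table `wide` and its column names are hypothetical and do not come from the original examples.

```r
library(data.table)

# Hypothetical toy table, used only for illustration
wide <- data.table(id = c("a", "b"), x = c(1, 2), y = c(3, 4))

# Only id.vars supplied: the remaining columns (x, y) become measure variables
by_id <- melt(wide, id.vars = "id")

# Only measure.vars supplied: the remaining column (id) becomes the id variable
by_measure <- melt(wide, measure.vars = c("x", "y"))

# Neither supplied: the non-numeric column (id) is guessed as the id variable.
# data.table may warn about the guess, so the warning is suppressed here.
guessed <- suppressWarnings(melt(wide))

print(by_id)
```

All three calls should produce the same long table with columns id, variable and value.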
melt.data.table also accepts
list columns for both id and measure variables.
When the measure.vars are not all of the same type, they'll be coerced
according to the hierarchy list > character > numeric >
integer > logical. For example, if any of the measure variables is a
list, then the entire value column will be coerced to a list.
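A minimal sketch of the type coercion applied to measure variables of different types. The table `mixed` and its columns are hypothetical; melt warns when it has to coerce, so the calls are wrapped in suppressWarnings() here.

```r
library(data.table)

# Hypothetical table with integer, double and character measure columns
mixed <- data.table(id = 1:2, i = c(1L, 2L), d = c(1.5, 2.5), s = c("a", "b"))

# integer + double: the value column is coerced to the higher type, double
num <- suppressWarnings(melt(mixed, id.vars = "id", measure.vars = c("i", "d")))

# integer + character: the value column is coerced to character
chr <- suppressWarnings(melt(mixed, id.vars = "id", measure.vars = c("i", "s")))

print(class(num$value))
print(class(chr$value))
```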
melt gains a feature with measure.vars
accepting a list of character or
integer vectors as well, to melt
into multiple columns in a single function call efficiently.
If a vector in the list contains missing values, or is shorter than the
max length of the list elements, then the output will include runs of
missing values at the specified position, or at the end.
patterns can be used to provide regular expression patterns. When
used along with melt, if the
cols argument is not provided, the
patterns will be matched against
names(data), for convenience.
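Melting into multiple value columns can be sketched as follows, once with an explicit list and once with patterns(). The table `wide` and the output names "a" and "b" are hypothetical, chosen only for illustration; the output names are passed through value.name.

```r
library(data.table)

# Hypothetical table with two groups of columns: a_* (numeric) and b_* (character)
wide <- data.table(
  id  = 1:2,
  a_1 = c(1, 2), a_2 = c(3, 4),
  b_1 = c("p", "q"), b_2 = c("r", "s")
)

# Each list element becomes one melted value column
by_list <- melt(wide, id.vars = "id",
                measure.vars = list(c("a_1", "a_2"), c("b_1", "b_2")),
                value.name = c("a", "b"))

# The same selection via regular expressions matched against names(wide)
by_pattern <- melt(wide, id.vars = "id",
                   measure.vars = patterns("^a_", "^b_"),
                   value.name = c("a", "b"))

print(by_list)
```

In both results the variable column holds the position within each group (1 or 2) rather than an input column name.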
Attributes are preserved if all
value columns are of the same type. By
default, if any of the columns to be melted are of type factor, the value column will
be coerced to
character type. To get a
factor column, set
value.factor = TRUE.
melt.data.table also preserves ordered factors.
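A short sketch of how value.factor changes the type of the value column when factor columns are melted. The table `fx` is hypothetical and only serves as an illustration.

```r
library(data.table)

# Hypothetical table with two (unordered) factor columns
fx <- data.table(id = 1:2,
                 f_1 = factor(c("a", "b")),
                 f_2 = factor(c("b", "c")))

# Default: the melted factor columns end up as a character value column
as_chr <- melt(fx, id.vars = "id")

# value.factor = TRUE: the value column is returned as a factor instead
as_fct <- melt(fx, id.vars = "id", value.factor = TRUE)

print(class(as_chr$value))
print(class(as_fct$value))
```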
melt.data.table was originally designed as an enhancement to
reshape2::melt in terms of computing and memory efficiency.
reshape2 has since been superseded in favour of tidyr, and
melt has had a generic defined within data.table since
v1.9.6 in 2015, at which point the dependency between the packages became more etymological than programmatic. We thank the
reshape2 authors for the inspiration.
Returns an unkeyed data.table containing the molten data.
set.seed(45)
require(data.table)
DT <- data.table(
  i_1 = c(1:5, NA),
  n_1 = c(NA, 6, 7, 8, 9, 10),
  f_1 = factor(sample(c(letters[1:3], NA), 6L, TRUE)),
  f_2 = factor(c("z", "a", "x", "c", "x", "x"), ordered=TRUE),
  c_1 = sample(c(letters[1:3], NA), 6L, TRUE),
  c_2 = sample(c(LETTERS[1:2], NA), 6L, TRUE),
  d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
  d_2 = as.Date(6:1, origin="2012-01-01")
)
# add a couple of list cols
DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5, 1L)))), by = i_1]$c]
DT[, l_2 := DT[, list(c=list(rep(c_1, sample(5, 1L)))), by = i_1]$c]

# id.vars, measure.vars as character/integer/numeric vectors
melt(DT, id.vars=1:2, measure.vars="f_1")
melt(DT, id.vars=c("i_1", "n_1"), measure.vars=3) # same as above
melt(DT, id.vars=1:2, measure.vars=3L, value.factor=TRUE) # same, but 'value' is factor
melt(DT, id.vars=1:2, measure.vars=3:4, value.factor=TRUE) # 'value' is *ordered* factor

# preserves attribute when types are identical, ex: Date
melt(DT, id.vars=3:4, measure.vars=c("d_1", "d_2"))
melt(DT, id.vars=3:4, measure.vars=c("n_1", "d_1")) # attribute not preserved

# on list
melt(DT, id.vars=1, measure.vars=c("l_1", "l_2")) # value is a list
suppressWarnings(
  melt(DT, id.vars=1, measure.vars=c("c_1", "l_1")) # c1 coerced to list, with warning
)

# on character
melt(DT, id.vars=1, measure.vars=c("c_1", "f_1")) # value is char
suppressWarnings(
  melt(DT, id.vars=1, measure.vars=c("c_1", "n_1")) # n_1 coerced to char, with warning
)

# on na.rm=TRUE. NAs are removed efficiently, from within C
melt(DT, id.vars=1, measure.vars=c("c_1", "c_2"), na.rm=TRUE) # remove NA

# measure.vars can be also a list
# melt "f_1,f_2" and "d_1,d_2" simultaneously, retain 'factor' attribute
# convenient way using internal function patterns()
melt(DT, id.vars=1:2, measure.vars=patterns("^f_", "^d_"), value.factor=TRUE)
# same as above, but provide list of columns directly by column names or indices
melt(DT, id.vars=1:2, measure.vars=list(3:4, c("d_1", "d_2")), value.factor=TRUE)
# same as above, but provide names directly:
melt(DT, id.vars=1:2, measure.vars=patterns(f="^f_", d="^d_"), value.factor=TRUE)

# na.rm=TRUE removes rows with NAs in any 'value' columns
melt(DT, id.vars=1:2, measure.vars=patterns("f_", "d_"), value.factor=TRUE, na.rm=TRUE)

# 'na.rm=TRUE' also works with list column, but note that is.na only
# returns TRUE if the list element is a length=1 vector with an NA.
is.na(list(one.NA=NA, two.NA=c(NA,NA)))
melt(DT, id.vars=1:2, measure.vars=patterns("l_", "d_"), na.rm=FALSE)
melt(DT, id.vars=1:2, measure.vars=patterns("l_", "d_"), na.rm=TRUE)

# measure list with missing/short entries results in output with runs of NA
DT.missing.cols <- DT[, .(d_1, d_2, c_1, f_2)]
melt(DT.missing.cols, measure.vars=list(d=1:2, c="c_1", f=c(NA, "f_2")))

# specifying columns to melt via separator.
melt(DT.missing.cols, measure.vars=measure(value.name, number=as.integer, sep="_"))

# specifying columns to melt via regex.
melt(DT.missing.cols, measure.vars=measure(value.name, number=as.integer, pattern="(.)_(.)"))