melt.data.table: Fast melt for data.table

Description

An S3 method extending reshape2:::melt for melting a data.table. reshape2 must also be loaded to use melt.data.table. It is very similar to reshape2:::melt.data.frame, but much faster and with some additional features.

Usage

## fast melt a data.table
## S3 method for class 'data.table'
melt(data, id.vars = NULL, measure.vars = NULL, 
	variable.name = "variable", value.name = "value", 
	..., na.rm = FALSE, variable.factor = TRUE, 
	value.factor = FALSE, 
	verbose = getOption("datatable.verbose"))

Arguments

data

A data.table object to melt.

id.vars

Vector of id variables. Can be an integer vector (id column numbers) or a character vector (id column names). If NULL, all non-measure columns are assigned to it.

measure.vars

Vector of measure variables. Can be an integer vector (measure column numbers) or a character vector (measure column names). If NULL, all non-id columns are assigned to it.

variable.name

name for the measured variable names column. The default name is 'variable'.

value.name

name for the molten data values column. The default name is 'value'.

na.rm

If TRUE, NA values will be removed from the molten data.

variable.factor

If TRUE, the variable column will be converted to factor, else it will be a character column.

value.factor

If TRUE, the value column will be converted to factor, else the molten value type is left unchanged.

verbose

TRUE turns on status and information messages to the console. Turn this on by default using options(datatable.verbose=TRUE). The quantity and types of verbosity may be expanded in future.

...

any other arguments to be passed to/from other methods
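
The naming and NA-handling arguments above can be sketched as follows. This is a minimal illustration, not taken from the package's own examples: the table `wide` and its columns are hypothetical, and it assumes a data.table release that exports melt itself (on the version described here, load reshape2 as well).

```r
library(data.table)

# Hypothetical wide table: one id column, two measure columns, one NA
wide <- data.table(id = 1:3,
                   height = c(1.5, NA, 1.7),
                   weight = c(60, 72, 81))

long <- melt(wide,
             id.vars = "id",
             measure.vars = c("height", "weight"),
             variable.name = "metric",   # replaces the default "variable"
             value.name = "reading",     # replaces the default "value"
             na.rm = TRUE)               # drops the row where height is NA

names(long)  # "id" "metric" "reading"
nrow(long)   # 5
```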

Details

If id.vars and measure.vars are both NULL, all columns that are not numeric, integer or logical are assigned as id variables and the remaining columns are assigned as measure variables. If only one of id.vars or measure.vars is supplied, the remaining columns are assigned to the other. Both id.vars and measure.vars may list the same column more than once, and the same column may be used as both an id and a measure variable.
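
As a rough sketch of the "only one supplied" rule, the two calls below should produce the same molten table. The table `scores` is hypothetical, and the code assumes a data.table release that exports melt itself.

```r
library(data.table)

# Hypothetical table: one character column, two numeric columns
scores <- data.table(student = c("a", "b"),
                     math = c(90, 80),
                     art = c(70, 60))

# Only id.vars given: math and art become the measure variables
by_id <- melt(scores, id.vars = "student")

# Only measure.vars given: student becomes the id variable
by_measure <- melt(scores, measure.vars = c("math", "art"))

all.equal(by_id, by_measure)
```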

melt.data.table also accepts list columns for both id and measure variables. When the measure.vars are not all of the same type, they are coerced according to the hierarchy list > character > numeric > integer > logical. For example, if any of the measure variables is a list, the entire value column is coerced to a list. Note that if the value column is a list, na.rm = TRUE has no effect.
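
The coercion hierarchy can be checked with a small sketch like the one below. The table `mixed` is hypothetical, the code assumes a data.table release that exports melt itself, and suppressWarnings() is used only because melt warns when measure variables differ in type.

```r
library(data.table)

# Hypothetical table with integer, numeric and character measure columns
mixed <- data.table(id = 1:2,
                    n = c(10L, 20L),
                    x = c(0.5, 1.5),
                    s = c("u", "v"))

# integer + numeric: the value column is numeric
num <- suppressWarnings(
  melt(mixed, id.vars = "id", measure.vars = c("n", "x")))
typeof(num$value)  # "double"

# integer + character: the value column is character
chr <- suppressWarnings(
  melt(mixed, id.vars = "id", measure.vars = c("n", "s")))
typeof(chr$value)  # "character"
```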

All class attributes on the value column (for example, Date) are dropped silently.

Value

An unkeyed data.table containing the molten data.

Note

Differences between melt.data.table and reshape2:::melt.data.frame:

  1. There are two additional arguments, variable.factor and value.factor, which for backwards compatibility with reshape2:::melt.data.frame default to TRUE and FALSE respectively.

  2. melt.data.table can handle list columns in both id and measure variables. The molten data retains list columns as such. As long as at least one of the measure.vars is a list, the value column of the molten data will be a list.

  3. melt(data, id=integer(0), measure=integer(0)) gives a data.table with 0 rows and 2 columns, variable and value (default names), whereas reshape2:::melt.data.frame gives 0 columns and nrow(data) rows.
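
A minimal sketch of the first point, showing the defaults of variable.factor and value.factor and the effect of flipping them. The table `dt` is hypothetical, and the code assumes a data.table release that exports melt itself.

```r
library(data.table)

# Hypothetical table whose measure columns are both character
dt <- data.table(id = 1:2, a = c("x", "y"), b = c("y", "z"))

# Defaults: variable.factor = TRUE, value.factor = FALSE
default <- melt(dt, id.vars = "id")
class(default$variable)  # "factor"
class(default$value)     # "character"

# Flipped: character variable column, factor value column
flipped <- melt(dt, id.vars = "id",
                variable.factor = FALSE, value.factor = TRUE)
class(flipped$variable)  # "character"
class(flipped$value)     # "factor"
```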

See Also

dcast.data.table, https://r-forge.r-project.org/projects/datatable/

Examples

set.seed(45)
require(reshape2)
require(data.table)
DT <- data.table(
      i1 = c(1:5, NA), 
      i2 = c(NA,6,7,8,9,10), 
      f1 = factor(sample(c(letters[1:3], NA), 6, TRUE)), 
      c1 = sample(c(letters[1:3], NA), 6, TRUE), 
      d1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"), 
      d2 = as.Date(6:1, origin="2012-01-01"))
DT[, l1 := DT[, list(c=list(rep(i1, sample(5,1)))), by = i1]$c] # list cols
DT[, l2 := DT[, list(c=list(rep(c1, sample(5,1)))), by = i1]$c]

# basic examples
melt(DT, id=1:2, measure=3) 
melt(DT, id=c("i1", "i2"), measure="f1", value.factor=TRUE) # same as above, but value is factor

# on Date
melt(DT, id=c("i1", "f1"), measure=c("d1", "d2")) # date class attribute lost
melt(DT, id=c("i1", "f1"), measure=c("c1", "d1")) # value is char, date attribute lost

# on list
melt(DT, id=1, measure=c("l1", "l2")) # value is a list
melt(DT, id=1, measure=c("c1", "l1")) # c1 coerced to list

# on character
melt(DT, id=1, measure=c("c1", "f1")) # value is char
melt(DT, id=1, measure=c("c1", "i2")) # i2 coerced to char

# on na.rm=TRUE
melt(DT, id=1, measure=c("c1", "i2"), na.rm=TRUE) # remove NA

Example output

Loading required package: reshape2

Attaching package: 'reshape2'

The following objects are masked from 'package:data.table':

    dcast, melt

   i1 i2 variable value
1:  1 NA       f1     c
2:  2  6       f1     b
3:  3  7       f1     a
4:  4  8       f1     b
5:  5  9       f1     b
6: NA 10       f1     b
   i1 i2 variable value
1:  1 NA       f1     c
2:  2  6       f1     b
3:  3  7       f1     a
4:  4  8       f1     b
5:  5  9       f1     b
6: NA 10       f1     b
    i1 f1 variable      value
 1:  1  c       d1 2013-09-02
 2:  2  b       d1 2013-09-03
 3:  3  a       d1 2013-09-04
 4:  4  b       d1       <NA>
 5:  5  b       d1 2013-09-05
 6: NA  b       d1 2013-09-06
 7:  1  c       d2 2012-01-07
 8:  2  b       d2 2012-01-06
 9:  3  a       d2 2012-01-05
10:  4  b       d2 2012-01-04
11:  5  b       d2 2012-01-03
12: NA  b       d2 2012-01-02
    i1 f1 variable value
 1:  1  c       c1     a
 2:  2  b       c1     c
 3:  3  a       c1     a
 4:  4  b       c1     a
 5:  5  b       c1     b
 6: NA  b       c1    NA
 7:  1  c       d1 15950
 8:  2  b       d1 15951
 9:  3  a       d1 15952
10:  4  b       d1    NA
11:  5  b       d1 15953
12: NA  b       d1 15954
Warning message:
In melt.data.table(DT, id = c("i1", "f1"), measure = c("c1", "d1")) :
  'measure.vars' [c1, d1] are not all of the same type. By order of hierarchy, the molten data value column will be of type 'character'. All measure variables not of type 'character' will be coerced to. Check DETAILS in ?melt.data.table for more on coercion.
    i1 variable       value
 1:  1       l1         1,1
 2:  2       l1       2,2,2
 3:  3       l1       3,3,3
 4:  4       l1       4,4,4
 5:  5       l1   5,5,5,5,5
 6: NA       l1          NA
 7:  1       l2       a,a,a
 8:  2       l2         c,c
 9:  3       l2           a
10:  4       l2         a,a
11:  5       l2       b,b,b
12: NA       l2 NA,NA,NA,NA
    i1 variable     value
 1:  1       c1         a
 2:  2       c1         c
 3:  3       c1         a
 4:  4       c1         a
 5:  5       c1         b
 6: NA       c1        NA
 7:  1       l1       1,1
 8:  2       l1     2,2,2
 9:  3       l1     3,3,3
10:  4       l1     4,4,4
11:  5       l1 5,5,5,5,5
12: NA       l1        NA
Warning message:
In melt.data.table(DT, id = 1, measure = c("c1", "l1")) :
  'measure.vars' [c1, l1] are not all of the same type. By order of hierarchy, the molten data value column will be of type 'list'. All measure variables not of type 'list' will be coerced to. Check DETAILS in ?melt.data.table for more on coercion.
    i1 variable value
 1:  1       c1     a
 2:  2       c1     c
 3:  3       c1     a
 4:  4       c1     a
 5:  5       c1     b
 6: NA       c1    NA
 7:  1       f1     c
 8:  2       f1     b
 9:  3       f1     a
10:  4       f1     b
11:  5       f1     b
12: NA       f1     b
    i1 variable value
 1:  1       c1     a
 2:  2       c1     c
 3:  3       c1     a
 4:  4       c1     a
 5:  5       c1     b
 6: NA       c1    NA
 7:  1       i2    NA
 8:  2       i2     6
 9:  3       i2     7
10:  4       i2     8
11:  5       i2     9
12: NA       i2    10
Warning message:
In melt.data.table(DT, id = 1, measure = c("c1", "i2")) :
  'measure.vars' [c1, i2] are not all of the same type. By order of hierarchy, the molten data value column will be of type 'character'. All measure variables not of type 'character' will be coerced to. Check DETAILS in ?melt.data.table for more on coercion.
    i1 variable value
 1:  1       c1     a
 2:  2       c1     c
 3:  3       c1     a
 4:  4       c1     a
 5:  5       c1     b
 6:  2       i2     6
 7:  3       i2     7
 8:  4       i2     8
 9:  5       i2     9
10: NA       i2    10
Warning message:
In melt.data.table(DT, id = 1, measure = c("c1", "i2"), na.rm = TRUE) :
  'measure.vars' [c1, i2] are not all of the same type. By order of hierarchy, the molten data value column will be of type 'character'. All measure variables not of type 'character' will be coerced to. Check DETAILS in ?melt.data.table for more on coercion.

data.table documentation built on May 2, 2019, 4:57 p.m.