setattr: Set attributes of objects by reference

View source: R/data.table.R

setattr    R Documentation

Description

In data.table, all set* functions change their input by reference. That is, no copy is made at all, other than temporary working memory, which is as large as one column. The only other data.table operator that modifies its input by reference is :=. See the See Also section below for the other set* functions that data.table provides.

Usage

setattr(x,name,value)
setnames(x,old,new,skip_absent=FALSE)

Arguments

x

setnames accepts a data.frame or data.table. setattr accepts any input; e.g., a list, or the columns of a data.frame or data.table.

name

The character attribute name.

value

The value to assign to the attribute, or NULL to remove the attribute if present.

old

When new is provided, the character names or numeric positions of the columns to rename. When new is not provided, a function or the new column names (i.e., old is implicitly treated as new; omitting old and explicitly naming new is equivalent). If a function, it is called with the current column names and is expected to return the new column names. The new column names must be the same length as the number of columns. See examples.

new

Optional. It can be a function or the new column names. If a function, it is called with old and is expected to return the new column names. The new column names must be the same length as the columns provided to the old argument. A missing value in new means that the corresponding column is not renamed. Note that missing values are only allowed when old is not provided.

skip_absent

Skip items in old that are missing (i.e., absent) from names(x). The default, FALSE, halts with an error if any are missing.

Details

setnames operates on data.table and data.frame, not on other types such as list and vector. It can be used to change names by name, with built-in checks (e.g., if any old names are missing or appear more than once).

setattr is a more general function that sets any attribute of an object by reference.
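As a minimal sketch of the point above, the following sets and then removes an attribute on a data.table, and sets names on a plain list. The attribute name "source" and its value are hypothetical and chosen only for illustration.

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)

setattr(DT, "source", "sensor-01")   # attach a custom attribute by reference (hypothetical name and value)
attr(DT, "source")                   # read it back

setattr(DT, "source", NULL)          # passing NULL removes the attribute
is.null(attr(DT, "source"))          # TRUE

l <- list(1, "a")
setattr(l, "names", c("num", "chr")) # setattr is not limited to data.table; here a plain list
names(l)
```

Note that no assignment back to DT or l is needed, because the object itself is modified.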

A very welcome change in R 3.1+ was that 'names<-' and 'colnames<-' no longer copy the entire object as they used to (up to 4 times); see the examples below. They now take a shallow copy. The set* functions in data.table are still useful because they do not take even a shallow copy. This allows changing the names and attributes of a (usually very large) data.table in the global environment from within functions, much like a database.
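The difference matters most inside functions. The sketch below contrasts the two behaviours; upper_names and upper_names_base are hypothetical helpers written only for this illustration, not part of data.table.

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)
DF <- data.frame(a = 1:3, b = 4:6)

# Hypothetical helper: renames by reference, so the caller's object changes.
upper_names <- function(dt) {
  setnames(dt, toupper)
  invisible(NULL)
}

# Hypothetical helper: 'names<-' modifies only the function's local (shallow) copy.
upper_names_base <- function(df) {
  names(df) <- toupper(names(df))
  invisible(NULL)
}

upper_names(DT)
names(DT)            # "A" "B"

upper_names_base(DF)
names(DF)            # "a" "b" (unchanged in the calling environment)
```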

Value

The input is modified by reference, and returned (invisibly) so it can be used in compound statements; e.g., setnames(DT,"V1", "Y")[, .N, by=Y]. If you require a copy, take a copy first (using DT2=copy(DT)). See ?copy.
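A short sketch of both points, using a small made-up table: the invisible return value feeds straight into [, and copy() is taken first so that the original keeps its names.

```r
library(data.table)

DT <- data.table(V1 = c("x", "x", "y"), V2 = 1:3)   # illustrative data only

DT2 <- copy(DT)                                      # independent copy, taken before renaming
counts <- setnames(DT2, "V1", "Y")[, .N, by = Y]     # rename, then aggregate in one statement

names(DT)     # "V1" "V2" (the original is untouched)
names(DT2)    # "Y"  "V2"
counts        # one row per value of Y with its count N
```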

Note that setattr is also in package bit. Both packages merely expose R's internal setAttrib function at C level but differ in return value. bit::setattr returns NULL (invisibly) to remind you the function is used for its side effect. data.table::setattr returns the changed object (invisibly) for use in compound statements.

See Also

data.table, setkey, setorder, setcolorder, set, :=, setDT, setDF, copy

Examples


DT <- data.table(a = 1, b = 2, d = 3)

old <- c("a", "b", "c", "d")
new <- c("A", "B", "C", "D")

setnames(DT, old, new, skip_absent = TRUE) # skips old[3] because "c" is not a column name of DT

DF = data.frame(a=1:2,b=3:4)       # base data.frame to demo copies and syntax
if (capabilities()["profmem"])     # usually memory profiling is available but just in case
  tracemem(DF)
colnames(DF)[1] <- "A"             # 4 shallow copies (R >= 3.1, was 4 deep copies before)
names(DF)[1] <- "A"                # 3 shallow copies
names(DF) <- c("A", "b")           # 1 shallow copy
`names<-`(DF,c("A","b"))           # 1 shallow copy

DT = data.table(a=1:2,b=3:4,c=5:6) # compare to data.table
if (capabilities()["profmem"])
  tracemem(DT)                     # by reference, no deep or shallow copies
setnames(DT,"b","B")               # by name, no match() needed (error if "b" is missing)
setnames(DT,3,"C")                 # by position (error if 3 > ncol(DT))
setnames(DT,2:3,c("D","E"))        # multiple by position
setnames(DT,c("a","E"),c("A","F")) # multiple by name (error if either "a" or "E" is missing)
setnames(DT,c("X","Y","Z"))        # replace all (length of names must be == ncol(DT))
setnames(DT,tolower)               # replace all names with their lower case
setnames(DT,2:3,toupper)           # replace the 2nd and 3rd names with their upper case

DT <- data.table(x = 1:3, y = 4:6, z = 7:9)
setnames(DT, -2, c("a", "b"))      # negative indices are allowed in 'old' (FR #1443)

DT = data.table(a=1:3, b=4:6)
f = function(...) {
    # ...
    setattr(DT,"myFlag",TRUE)  # by reference
    # ...
    localDT = copy(DT)
    setattr(localDT,"myFlag2",TRUE)
    # ...
    invisible()
}
f()
attr(DT,"myFlag")   # TRUE
attr(DT,"myFlag2")  # NULL


data.table documentation built on Oct. 10, 2024, 5:07 p.m.