rbind concatenates its arguments by row; see cbind for basic documentation. There is an rbind method for data frames which mvbutils overrides, and rbdf calls the override directly. The mvbutils version should behave exactly as the base-R version, with two exceptions:
zero-row arguments are not ignored, e.g. so that factor levels which never appear are not dropped.
dimensioned (array or matrix) elements do not lose any extra attributes (such as class).
I find the zero-row behaviour more logical, and useful because e.g. it lets me create an empty.data.frame with the correct type/class/levels for all columns, then subsequently add rows to it. The behaviour for matrix (array) elements allows e.g. the rbinding of data frames that contain matrices of POSIXct elements without losing the POSIXct class (as in my package nicetime).
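A minimal base-R sketch of the zero-row issue described above. The column names and levels are made up for illustration, and base::rbind is called explicitly so that the mvbutils override is not involved:

```r
# Template with the desired column types and factor levels, but no rows
template <- data.frame(
  x = factor(character(0), levels = c("yes", "no")),
  y = numeric(0)
)
new_row <- data.frame(x = factor("no"), y = 0)

# Base-R drops the zero-row argument before combining,
# so the never-used level "yes" does not survive
res <- base::rbind(template, new_row)
levels(res$x)
```

With the mvbutils version, the intent is that the template's levels are kept instead.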
When rbinding data frames, best practice is to make sure all the arguments really are data frames. Lists and matrices also work OK (they are first coerced to data frames), but scalars are dangerous (even though base-R will process them without complaint). rbind is quirky around data frames; unless all the arguments are data frames, sometimes rbind.data.frame will not be called even when you'd expect it to be, and the coercion of scalars is frankly potty; see Details and Examples. mvbutils:::rbind.data.frame tries to mimic the base-R scalar coercion, but I'm not sure it's 100% compatible. Again, the safest way to ensure a predictable outcome is to make sure all arguments really are data frames, and/or to call rbdf directly.
Note that ("thanks" to stringsAsFactors) the order in which data frames are rbound can affect the result; see Examples.
Versions of mvbutils prior to 2.8.207 installed replacements for $<-.data.frame and [[<-.data.frame that circumvented weird behaviour with the base-R versions when the data.frame had zero rows. That weird behaviour seems to be fixed in base-R as of version 3.4.4 (perhaps earlier). I've therefore removed those replacements (after warnings from newer versions of R CMD check). Hopefully, everything works... but just for the record, here's the old text, which I think no longer applies.
[I think this paragraph is obsolete.] Normally, you can replace elements in, or add a column to, data frames via e.g. x$y <- z or x[["y"]] <- z. However, in base-R this fails for no good reason if x is a zero-row data frame; the sensible behaviour when y doesn't exist yet would be to create a zero-length column of the appropriate class. mvbutils overrides the base (mis)behaviour so it works sensibly. Should work for matrix/array "replacements" too.
rbind(..., deparse.level = 1) # generic
## S3 method for class 'data.frame'
rbind(..., deparse.level = 1) # S3 method for data.frame
rbdf(..., deparse.level = 1) # explicitly call S3 method...
# ... for data frames (circumvent rbind dispatch)
## OBSOLETE x[[i,j]] <- value # S3 method for data.frame; only ...
## OBS ... the version x[[i]] <- value is relevant here, tho' arguably j==0 might be
## OBS x$name <- value # S3 method for data.frame
... | Data frames, or things that will be coerced to data frames. NULLs are ignored.
deparse.level | not used by
Old arguments (for the obsolete replacement methods in the usage above):
i, j | column and row subscripts
name | column name
value | that's up to you; I just have to include them here to stop R CMD check from moaning... :/
See cbind documentation in base-R.
R's dispatch mechanism for rbind is as follows [my paraphrasing of base-R documentation]. Mostly, if any argument is a data frame then rbind.data.frame will be used. However, if one argument is a data frame but another argument is a scalar/matrix of a class that has an rbind method, then "default rbind" will be called instead. Although the latter still returns a data frame, it stuffs up e.g. class attributes, so that POSIXct objects will be turned into huge numbers. Again, if you really want a data frame result, make sure all the arguments are data frames.
In mvbutils:::rbind.data.frame (and AFAIK in the base-R version), arguments that are not data frames are coerced to data frames, by calling data.frame() on them. AFAICS this works predictably for list and matrix arguments; note that lists need names, and matrices need column names, that match the names of the real data frame arguments, because column alignment is done by name not position. Behaviour for scalars is IMO weird; see Examples. The idea seems to be to turn each scalar into a single-row data frame, coercing its names and truncating/replicating it to match the columns of the first real data frame argument; any names of the scalar itself are disregarded, and alignment is by position not name. Although mvbutils:::rbind.data.frame tries to mimic this coercion, it seems to me unnecessary (the user should just turn the scalar into something less ambiguous), confusing, and dangerous, so mvbutils issues a warning. Whether I have duplicated every quirk, I'm not sure.
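One way to follow that advice, sketched in base-R with made-up column names: turn the named scalar into a one-row data frame yourself before rbinding, so that alignment happens by name. The as.data.frame(as.list(...)) conversion is just one possible choice, not something the package prescribes.

```r
df <- data.frame(x = 1, y = 2)

# A named vector whose names are deliberately in the "wrong" order
scalar <- c(y = 20, x = 10)

# Make it an unambiguous one-row data frame with columns y and x
extra <- as.data.frame(as.list(scalar))

# Data frame arguments are matched to the first argument by column name
res <- base::rbind(df, extra)
res$x  # 1 10
res$y  # 2 20
```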
Note also that R's accursed drop=TRUE default means that things you might reasonably think should be data frames, might not be. Under some circumstances, this might result in rbind.data.frame being bypassed. See Examples.
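A quick base-R check of the drop=TRUE point, using throwaway data frames; testing with is.data.frame() before rbinding is a suggestion here rather than anything mvbutils does for you:

```r
two_col <- data.frame(x = 1, y = 2)
one_col <- data.frame(x = 1)

a <- two_col[-1, ]                # still a (zero-row) data frame
b <- one_col[-1, ]                # single column is dropped to a bare vector
k <- one_col[-1, , drop = FALSE]  # stays a zero-row data frame

is.data.frame(a)  # TRUE
is.data.frame(b)  # FALSE
is.data.frame(k)  # TRUE
```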
Short of rewriting data.frame and rbind, there's nothing mvbutils can do to fix these quirks. Whether base-R should consider any changes is another story, but back-compatibility probably suggests not.
[Taken from the base-R documentation, modified to fit the mvbutils version]
The rbind data frame method first drops any NULL arguments, then coerces all others to data frames (see Details for how it does this with scalars). Then it drops all zero-column arguments. (If that leaves none, it returns a zero-column zero-row data frame.) It then takes the classes of the columns from the first argument, and matches columns by name (rather than by position). Factors have their levels expanded as necessary (in the order of the levels of the levelsets of the factors encountered) and the result is an ordered factor if and only if all the components were ordered factors. (The last point differs from S-PLUS.) Old-style categories (integer vectors with levels) are promoted to factors. Zero-row arguments are kept, so that in particular their column classes and factor levels are taken account of.
Because the class of each column is set by the first data frame, rather than "by consensus", numeric/character/factor conversions can be a bit surprising, especially where NAs are involved. See the final bit of Examples.
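The first-argument rule can be seen in base-R with a small sketch; stringsAsFactors is set explicitly so the result does not depend on the R version's default, and the variable names are only illustrative:

```r
x1 <- data.frame(x = "yes", stringsAsFactors = TRUE)   # factor column
x2 <- data.frame(x = "no", stringsAsFactors = FALSE)   # character column

# Factor first: the column stays a factor and its levels are expanded
fac_first <- base::rbind(x1, x2)$x
# Character first: the factor values are converted to character
chr_first <- base::rbind(x2, x1)$x

class(fac_first)
class(chr_first)
```

If the column type matters, converting the columns to a common type before rbinding avoids depending on argument order.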
cbind and data.frame in base-R; empty.data.frame
# mvbutils versions are used, unless base:: or baseenv() gets mentioned
# Why base-R dropping of zero rows is odd
rbind( data.frame( x='yes', y=1)[-1,], data.frame( x='no', y=0))$x # mvbutils
#[1] no
#Levels: yes no # two levels
base::rbind( data.frame( x='yes', y=1)[-1,], data.frame( x='no', y=0))$x # base-R
#[1] no
#Levels: no # lost level
rbind( data.frame( x='yes', y=1)[-1,], data.frame( x='no', y=0, stringsAsFactors=FALSE))$x
#[1] no
#Levels: yes no
base::rbind( data.frame( x='yes', y=1)[-1,], data.frame( x='no', y=0, stringsAsFactors=FALSE))$x
#[1] "no" # x has turned into a character
# Quirks of scalar coercion
evalq( rbind( data.frame( x=1), x=2, x=3), baseenv()) # OK I guess
# x
#1 1
#x 2
#x1 3
evalq( rbind( data.frame( x=1), x=2:3), baseenv()) # NB lost element
# x
#1 1
#x 2
evalq( rbind( data.frame( x=1, y=2, z=3), c( x=4, y=5)), baseenv())
# NB gained element! Try predicting z[2]...
# x y z
#1 1 2 3
#2 4 5 4
evalq( rbind( data.frame( x='cat', y='dog'), cbind( x='flea', y='goat')), baseenv()) # OK
# x y
#1 cat dog
#2 flea goat
evalq( rbind( data.frame( x='cat', y='dog'), c( x='flea', y='goat')), baseenv()) # Huh?
#Warning in `[<-.factor`(`*tmp*`, ri, value = "flea") :
# invalid factor level, NAs generated
#Warning in `[<-.factor`(`*tmp*`, ri, value = "goat") :
# invalid factor level, NAs generated
# x y
#1 cat dog
#2 <NA> <NA>
evalq( rbind( data.frame( x='cat', y='dog'), c( x='flea')), baseenv()) # Hmmm...
#Warning in `[<-.factor`(`*tmp*`, ri, value = "flea") :
# invalid factor level, NAs generated
#Warning in `[<-.factor`(`*tmp*`, ri, value = "flea") :
# invalid factor level, NAs generated
# x y
#1 cat dog
#2 <NA> <NA>
try( evalq( rbind( data.frame( x='cat', y='dog'), cbind( x='flea')), baseenv())) # ...mmmm...
#Error in rbind(deparse.level, ...) :
# numbers of columns of arguments do not match
# Data frames that aren't:
data.frame( x=1,y=2)[-1,] # a zero-row DF-- OK
# [1] x y
# <0 rows> (or 0-length row.names)
data.frame( x=1)[-1,] # not a DF!?
# numeric(0)
data.frame( x=1)[-1,,drop=FALSE] # OK, but exceeeeeedingly cumbersome
# <0 rows> (or 0-length row.names)
# Implications for rbind:
rbind( data.frame( x='yes')[-1,], x='no')
# [,1]
# x "no" # rbind.data.frame not called!
rbind( data.frame( x='yes')[-1,,drop=FALSE], x='no')
#Warning in rbind(deparse.level, ...) :
# risky to supply scalar argument(s) to 'rbind.data.frame'
# x
#x no
# Quirks of ordering and character/factor conversion:
rbind( data.frame( x=NA), data.frame( x='yes'))$x
#[1] NA "yes" # character
rbind( data.frame( x=NA_character_), data.frame( x='yes'))$x
#[1] <NA> yes
#Levels: yes # factor!
rbind( data.frame( x='yes'), data.frame( x=NA))$x[2:1]
#[1] <NA> yes
#Levels: yes # factor again
x1 <- data.frame( x='yes', stringsAsFactors=TRUE)
x2 <- data.frame( x='no', stringsAsFactors=FALSE)
rbind( x1, x2)$x
# [1] yes no
# Levels: yes no
rbind( x2, x1)$x
# [1] "no" "yes"
# sigh...