library(magrittr) knit_print.data.frame = function(x, ...) { res = paste(c('', '', knitr::kable(x)), collapse = '\n') knitr::asis_output(res) } `knit_print.{` = function(x, ...) { res <- paste(c("```r", paste0(c("", rep(" ", length(x)-1)), as.character(x)), "}", "```", ""), collapse = '\n') knitr::asis_output(res) }
When loading both plyr and dplyr,
the last package loaded overwrites symbols exported by the
package loaded first:
library(plyr) library(dplyr)
Currently, the following symbols are affected:
plyr_exports <- ls("package:plyr") dplyr_exports <- ls("package:dplyr") (both_exports <- intersect(plyr_exports, dplyr_exports))
This means that existing projects that use plyr
cannot simply load dplyr using library(dplyr)
without potentially breaking existing code.
There are workarounds, but all of them seem to have
specific disadvantages:
dplyr::summarise).dplyr after plyr, modify usage of conflicting symbols
in the codedplyr primitives with plyrThis document explores alternative solutions.
Let's take a closer look at the interface of the functions exported from both packages:
formals_df <- both_exports %>% setNames(nm = .) %>% lapply( . %>% { lapply( c("plyr", "dplyr") %>% setNames(nm = .), function (x) { get(., sprintf("package:%s", x)) %>% formals %>% names %>% paste(collapse = ", ") } ) } ) %>% plyr::ldply(data.frame, stringsAsFactors = FALSE, .id = "name") %>% plyr::mutate(data_first = grepl("^[.]data,", dplyr) | grepl("^df,", plyr), identical = plyr == dplyr) %>% plyr::arrange(!data_first, !identical, name) formals_df
We can split the overlaps in two groups:
formals_df %>% subset(data_first) %>% plyr::mutate(data_first = NULL)
Here, mutate, summari[sz]e and arrange mean the same in both plyr and dplyr,
although plyr::summari[sz]e on a grouped tbl_df seems to
destroy the grouping.
The count function is similar, too, only that plyr::count
is more similar to dplyr::count_, and that for some reason
the first argument of dplyr::count is called x and not .data.
Also, plyr::rename seems to be more similar to dplyr::rename_.
formals_df %>% subset(!data_first & identical) %>% plyr::mutate(data_first = NULL, identical = NULL)
Here, we take a look at the bodies. We compare them for textual identity.
body_l <- formals_df %>% subset(!data_first & identical) %>% extract2("name") %>% as.character %>% setNames(nm = .) %>% lapply( . %>% { lapply( c("plyr", "dplyr") %>% setNames(nm = .), function (x) { get(., sprintf("package:%s", x)) %>% body } ) } )
body_l %>% llply(. %>% { identical(as.character(.$plyr), as.character(.$dplyr)) } ) %>% ldply(. %>% data.frame(identical_body = .), .id = "name")
Only failwith is different, but the effect seems to be the same:
body_l$failwith$plyr body_l$failwith$dplyr
For fully seamless transition between plyr and dplyr,
a compatibility package looks like a possible option.
It would provide the union of the exported functions of both packages,
and compatibility wrappers for the two functions
count and rename
that need special attention.
Deprecating the functions count and rename in the plyr package seems simpler.
For an even more radical solution,
all overlapping functions could be deprecated,
referring to identical functionality in dplyr:
mutate -> dplyr::mutatesummari[sz]e -> dplyr::summari[sz]earrange(df, ...) -> dplyr::arrange(.data, ...)count(df, vars, wt_var) -> dplyr::count_(x, vars, wt, sort = FALSE)n column to freqrename(x, replace, warn_missing) -> dplyr::rename_(.data, replace)replace needs to be massagedwarn_missing is always TRUEdesc(), failwith(), id() -> dplyr::...For practical use, a thin compatibility layer seems to work reasonably well
for a project that was created using plyr and is now transitioning
towards dplyr:
attach(pdlyr::plyr_compat)
Tests in the pdlyr package will assure that this package
can be safely loaded with warn.conflicts = FALSE.
The original plyr implementations can be accessed via shortcuts
(prefix p).
A few usage examples:
mtcars %>% mutate(lphkm = 100 * 3.785411784 / 1.609344 / mpg) %>% head mtcars %>% plyr::mutate(lphkm = 100 * 3.785411784 / 1.609344 / mpg) %>% head mtcars %>% ddply("cyl", summarize, mean_mpg = mean(mpg)) mtcars %>% ddply("cyl", plyr::summarise, mean_mpg = mean(mpg)) mtcars %>% arrange(-wt) %>% head mtcars %>% plyr::arrange(-wt) %>% head mtcars %>% count("gear") mtcars %>% plyr::count("gear") mtcars %>% rename(list(mpg = "miles_per_gallon")) %>% extract(1:2) %>% head mtcars %>% plyr::rename(list(mpg = "miles_per_gallon")) %>% extract(1:2) %>% head
Last changed: r format(Sys.time(), tz="UTC", usetz=TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.