
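Factors in an R data.frame.

Some issues with summarise_each and factor columns: min and max fail on a factor column of a local data.frame, but succeed once the same data is copied to a database backend.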

library('dplyr')
 #  
 #  Attaching package: 'dplyr'
 #  The following objects are masked from 'package:stats':
 #  
 #      filter, lag
 #  The following objects are masked from 'package:base':
 #  
 #      intersect, setdiff, setequal, union
R.Version()$version.string
 #  [1] "R version 3.3.2 (2016-10-31)"
packageVersion('dplyr')
 #  [1] '0.5.0'
d1 <- data.frame(y=c('a','b'),stringsAsFactors = FALSE)
d1 %>% dplyr::summarise_each(dplyr::funs(lexmin = min,lexmax = max))
 #    lexmin lexmax
 #  1      a      b
d2 <- data.frame(y=c('a','b'),stringsAsFactors = TRUE)
d2 %>% dplyr::summarise_each(dplyr::funs(lexmin = min,lexmax = max))
 #  Error in eval(expr, envir, enclos): 'min' not meaningful for factors

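Submitted as dplyr issue 2269. Closed as "expected behavior", since base R's min(factor(letters)) raises the same error. That is a correct determination, but be aware that many dplyr backends do support comparison, min, and max on character types, so the same pipeline can behave differently on a remote source. With SQLite, both d1 and d2 summarize without error and the results come back as character columns: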

my_db <- dplyr::src_sqlite("replyr_sqliteEx.sqlite3", create = TRUE)
dplyr::copy_to(dest=my_db,df=d1,name='d1',overwrite=TRUE) %>% 
  dplyr::summarise_each(dplyr::funs(lexmin = min,lexmax = max))
 #  Source:   query [?? x 2]
 #  Database: sqlite 3.8.6 [replyr_sqliteEx.sqlite3]
 #  
 #    lexmin lexmax
 #     <chr>  <chr>
 #  1      a      b
dplyr::copy_to(dest=my_db,df=d2,name='d2',overwrite=TRUE) %>% 
  dplyr::summarise_each(dplyr::funs(lexmin = min,lexmax = max))
 #  Source:   query [?? x 2]
 #  Database: sqlite 3.8.6 [replyr_sqliteEx.sqlite3]
 #  
 #    lexmin lexmax
 #     <chr>  <chr>
 #  1      a      b
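
To make a local data.frame behave like the database results above, one option is to treat factor labels as character data before calling min and max. The following is a minimal base-R sketch, not the approach taken by dplyr or replyr; the helper names chr_range and factors_to_character are hypothetical and introduced only for illustration. It uses base R only, so it does not depend on the dplyr version.

```r
# A minimal base-R sketch; chr_range and factors_to_character are
# hypothetical helper names, not part of dplyr or replyr.

f <- factor(c('a', 'b'))

# min() on an unordered factor signals an error; try() captures it.
res <- try(min(f), silent = TRUE)
min_fails <- inherits(res, 'try-error')

# Option 1: summarize the labels as character data.
chr_range <- function(x) {
  x <- as.character(x)
  c(lexmin = min(x), lexmax = max(x))
}
r1 <- chr_range(f)

# Option 2: convert every factor column of a data.frame to character
# before summarizing it (or before copying it to a database).
factors_to_character <- function(d) {
  is_fac <- vapply(d, is.factor, logical(1))
  if (any(is_fac)) {
    d[is_fac] <- lapply(d[is_fac], as.character)
  }
  d
}
d2 <- data.frame(y = c('a', 'b'), stringsAsFactors = TRUE)
d2c <- factors_to_character(d2)
r2 <- chr_range(d2c$y)

# Option 3: ordered factors do define min and max, by level order
# rather than by string comparison.
o <- factor(c('a', 'b'), levels = c('a', 'b'), ordered = TRUE)
r3 <- as.character(min(o))
```

Options 1 and 2 compare strings, which matches what the SQLite backend does with character columns, though string collation can differ between R's locale and a given database. Option 3 compares by level order instead, so it gives the same answer only when the levels happen to be sorted the same way as the labels.
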
version
 #                 _                           
 #  platform       x86_64-apple-darwin13.4.0   
 #  arch           x86_64                      
 #  os             darwin13.4.0                
 #  system         x86_64, darwin13.4.0        
 #  status                                     
 #  major          3                           
 #  minor          3.2                         
 #  year           2016                        
 #  month          10                          
 #  day            31                          
 #  svn rev        71607                       
 #  language       R                           
 #  version.string R version 3.3.2 (2016-10-31)
 #  nickname       Sincere Pumpkin Patch


WinVector/replyr documentation built on Oct. 22, 2020, 8:07 p.m.