library(dplyr)
library(devtools)
# devtools::install_github("STAT545-UBC-students/hw07-Sukeysun")
Factors are a very useful type of variable in R, but they can also drive you nuts. Especially the "stealth factor" that you think of as character.
Can we soften some of their sharp edges?
Binding two factors via fbind():
library(foofactors)
a <- factor(c("character", "hits", "your", "eyeballs"))
b <- factor(c("but", "integer", "where it", "counts"))
Simply catenating two factors leads to a result that most don't expect.
c(a, b)
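What c(a, b) returns depends on the R version: before R 4.1.0 it drops the levels and returns the underlying integer codes, while R 4.1.0 and later combine the levels and return a factor. The snippet below makes the integer codes explicit, which is the surprising result older versions produce. It uses only base R.

```r
a <- factor(c("character", "hits", "your", "eyeballs"))
b <- factor(c("but", "integer", "where it", "counts"))

# Levels are sorted alphabetically, so each value maps to its level's position.
codes_a <- as.integer(a)
codes_b <- as.integer(b)

# Concatenating the codes loses all information about the original strings.
combined_codes <- c(codes_a, codes_b)
combined_codes
```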
The fbind() function glues two factors together and returns a factor.
fbind(a, b)
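A minimal sketch of how such a binding function could work, assuming it goes through character vectors; the name fbind_sketch is hypothetical and this is not necessarily how the package implements fbind():

```r
# Hypothetical stand-in for fbind(): convert both factors to character,
# concatenate, and build a fresh factor from the combined values.
fbind_sketch <- function(a, b) {
  stopifnot(is.factor(a), is.factor(b))
  factor(c(as.character(a), as.character(b)))
}

a <- factor(c("character", "hits", "your", "eyeballs"))
b <- factor(c("but", "integer", "where it", "counts"))
res <- fbind_sketch(a, b)
res
```

Going through as.character() keeps the original strings, so the result does not depend on how the integer codes of a and b happen to line up.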
Often we want a table of frequencies for the levels of a factor. The base table() function returns an object of class table, which can be inconvenient for downstream work. Processing with as.data.frame() can be helpful but it's a bit clunky.
set.seed(1234)
x <- factor(sample(letters[1:5], size = 100, replace = TRUE))
table(x)
as.data.frame(table(x))
The freq_out() function returns a frequency table as a well-named tbl_df:
freq_out(x)
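For illustration, a frequency table with well-named columns can be built in base R roughly as follows. The name freq_out_sketch and the column names x and n are assumptions, and the sketch returns a plain data.frame rather than a tbl_df:

```r
# Hypothetical stand-in for freq_out(): count each level and return the
# counts as a data frame with explicit column names.
freq_out_sketch <- function(x) {
  stopifnot(is.factor(x))
  counts <- table(x)
  data.frame(
    x = factor(names(counts), levels = levels(x)),
    n = as.integer(counts)
  )
}

x <- factor(c("a", "b", "a", "c", "a"))
out <- freq_out_sketch(x)
out
```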
The detect_factor() function tests whether a factor is really a character vector in disguise, that is, whether every one of its values is unique.
x <- c('a', 'b', 'b')
x_fact <- factor(x)
y <- c('a', 'b')
y_fact <- factor(y)
detect_factor(x_fact)
detect_factor(y_fact)
Here x_fact is a genuine factor, because it contains repeated values, while y_fact is really character: the number of unique values in y_fact equals its length.
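The uniqueness check described above can be sketched as follows. The name detect_factor_sketch and the choice to return TRUE for a genuine factor are assumptions; the package's detect_factor() may report its result differently:

```r
# Hypothetical check: a factor is "genuine" when at least one value repeats,
# i.e. when it has fewer unique values than elements.
detect_factor_sketch <- function(x) {
  stopifnot(is.factor(x))
  length(unique(x)) != length(x)
}

is_x_factor <- detect_factor_sketch(factor(c("a", "b", "b")))  # repeated values
is_y_factor <- detect_factor_sketch(factor(c("a", "b")))       # all values unique
```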
The reorder_factor() function returns the input factor reordered in descending order.
x <- c('c', 'b', 'd')
x_fact <- factor(x)
y <- c('c', 'c', 'd')
# Set the level order on the factor itself (assigning levels to the character vector y has no effect on y_fact)
y_fact <- factor(y, levels = c('d', 'c'))
print("The original order of factor")
levels(x_fact)
levels(y_fact)
print("Reorder the input factor")
reorder_factor(x_fact)
reorder_factor(y_fact)
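"Descending order" can be read in more than one way. Assuming it means reverse-alphabetical level order, a base R sketch could look like this; the name reorder_factor_sketch is hypothetical and the package function may use a different criterion:

```r
# Hypothetical stand-in: rebuild the factor with its levels sorted in
# decreasing (reverse-alphabetical) order. The values themselves are unchanged.
reorder_factor_sketch <- function(x) {
  stopifnot(is.factor(x))
  factor(x, levels = sort(levels(x), decreasing = TRUE))
}

x_fact <- factor(c("c", "b", "d"))
reordered <- reorder_factor_sketch(x_fact)
levels(reordered)
```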
The set_factor() function sets the levels to the order in which they appear in the data, i.e. it sets the levels "as is".
x <- c("b", "c", "a")
x_fact <- factor(x)
print("The original order of factor")
levels(x_fact)
print("Set factor")
set_factor(x_fact)
By default, the levels of x_fact are sorted in increasing (alphabetical) order. After set_factor(), the levels follow the order in which each value first appears in the data.
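Ordering levels by first appearance can be sketched in base R as below. The name set_factor_sketch is hypothetical; forcats::fct_inorder() offers the same behaviour in the tidyverse:

```r
# Hypothetical stand-in for set_factor(): use the unique values, in the order
# they occur in the data, as the levels.
set_factor_sketch <- function(x) {
  stopifnot(is.factor(x))
  factor(x, levels = unique(as.character(x)))
}

x_fact <- factor(c("b", "c", "a"))
default_levels <- levels(x_fact)                    # alphabetical
as_is_levels <- levels(set_factor_sketch(x_fact))   # order of appearance
```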
The read_dataframe() function reads a data frame from a plain-text delimited file while retaining factor levels.
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
df <- read_dataframe(gdURL)
data.frame(
  variable = names(df),
  classes = sapply(df, class),
  factorlevel = sapply(df, nlevels),
  first_values = sapply(df, function(x) paste0(head(x), collapse = ", ")),
  row.names = NULL
) %>%
  knitr::kable()
We can see that country and continent are read as factors.
The write_dataframe() function writes a data frame to a plain-text delimited file while retaining factor levels. The output file is named "written_dataframe.txt".
df <- data.frame(x = c("a", "b", "c"), y = c("d", "e", "f"))
write_dataframe(df)
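A plain delimited file cannot store level order by itself, so one way to retain it is a companion file holding the levels. The sketch below assumes that approach; the helper names write_with_levels and read_with_levels and the ".levels" suffix are hypothetical, and the package's write_dataframe() and read_dataframe() may work differently:

```r
# Hypothetical writer: save the data as tab-delimited text and the levels of
# every factor column in a companion file next to it.
write_with_levels <- function(df, path) {
  write.table(df, path, sep = "\t", quote = FALSE, row.names = FALSE)
  lv <- lapply(Filter(is.factor, df), levels)
  dput(lv, file = paste0(path, ".levels"))
  invisible(path)
}

# Hypothetical reader: read the text file, then restore each factor column
# using the level order stored in the companion file.
read_with_levels <- function(path) {
  df <- read.delim(path, stringsAsFactors = FALSE)
  lv <- dget(paste0(path, ".levels"))
  for (nm in names(lv)) {
    df[[nm]] <- factor(df[[nm]], levels = lv[[nm]])
  }
  df
}

df <- data.frame(
  x = factor(c("a", "b", "c"), levels = c("c", "a", "b")),
  y = c("d", "e", "f"),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".txt")
write_with_levels(df, path)
df2 <- read_with_levels(path)
levels(df2$x)
```

The non-default level order of x survives the round trip, while y comes back as character.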