Installtion

library(dplyr)
library(devtools)
# devtools::install_github("STAT545-UBC-students/hw07-Sukeysun")

Functions in foofactor

Factors are a very useful type of variable in R, but they can also drive you nuts. Especially the "stealth factor" that you think of as character.

Can we soften some of their sharp edges?

fbind

Binding two factors via fbind():

library(foofactors)
a <- factor(c("character", "hits", "your", "eyeballs"))
b <- factor(c("but", "integer", "where it", "counts"))

Simply catenating two factors leads to a result that most don't expect.

c(a, b)

The fbind() function glues two factors together and returns factor.

fbind(a, b)

Often we want a table of frequencies for the levels of a factor. The base table() function returns an object of class table, which can be inconvenient for downstream work. Processing with as.data.frame() can be helpful but it's a bit clunky.

set.seed(1234)
x <- factor(sample(letters[1:5], size = 100, replace = TRUE))
table(x)
as.data.frame(table(x))

freq_out

The freq_out() function returns a frequency table as a well-named tbl_df:

freq_out(x)

detect_factor

The detect_factor function is a function to test whether the factor is character.

x <- c('a', 'b', 'b')
x_fact <- factor(x)
y <- c('a', 'b')
y_fact <- factor(y)
detect_factor(x_fact)
detect_factor(y_fact)

x_fact is factor while y_fact is character as the length of unique values of y_fact is equal to the length of y_fact

reorder_factor

The reorder_factor returns the input factor in descending order

x <- c('c', 'b', 'd')
x_fact <- factor(x)
y <- c('c','c','d')
y_fact <- factor(y)
levels(y) <- c('d','c')
print("The original order of factor")
levels(x_fact)
levels(y_fact)
print("Reorder the input factor")
reorder_factor(x_fact)
reorder_factor(y_fact)

set_factor

This function sets levels to the order in which they appear in the data, i.e. set the levels “as is”

x <- c("b","c","a")
x_fact <- factor(x)
print("The original order of factor")
levels(x_fact)
print("Set factor")
set_factor(x_fact)

The original order of x levles is set in an increasing order. Now the it is set as the sequence that each level appears.

read_dataframe

read data frames to plain text delimited files while retaining factor levels

gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
df <- read_dataframe(gdURL)


data.frame( variable = names( df ),
           classes = sapply( df, class ),
           factorlevel = sapply( df, nlevels ),
           first_values = sapply( df, function( x ) paste0( head( x ),  collapse = ", ") ),
           row.names = NULL ) %>%
  knitr::kable()

We can see that the country and continent is read as factor.

write_dataframe

write data frames to plain text delimited files while retaining factor levels. The dataframe will be named as "written_dataframe.txt"

df <-  data.frame(x = (c("a","b","c")),y=c("d","e","f"))
write_dataframe(df)


STAT545-UBC-students/hw07-Sukeysun documentation built on May 24, 2019, 7:53 a.m.