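Although R has many tools for counting (notably the table function), freqTable provides additional functionality by leveraging data.table methods: an optional total row, optional percentages, ordering of the result, and counts over several columns at once.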
library(akmisc)
library(data.table)

# example data set
set.seed(111)
i_nrow <- 500
dt_data <- data.table(
  var1 = rbinom(i_nrow, size = 2, prob = 0.1),
  var2 = rbinom(i_nrow, size = 1, prob = 0.5))

freqTable(dt_data, "var1")
freqTable(dt_data, "var1", b_total_row = TRUE)
freqTable(dt_data, "var1", b_total_row = TRUE, b_include_perc = TRUE, s_order_by = "ascending")
freqTable(dt_data, c("var1", "var2"), b_total_row = TRUE, b_include_perc = TRUE, s_order_by = "ascending")
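For comparison, the kind of table freqTable returns can be approximated with plain data.table syntax. This is a minimal sketch, not the akmisc implementation; the names dt_demo, dt_counts, count and perc are made up for illustration.

```r
library(data.table)

# illustrative data, independent of the example above
set.seed(111)
dt_demo <- data.table(var1 = rbinom(500, size = 2, prob = 0.1))

# count rows per value of var1; keyby sorts the result ascending by var1
dt_counts <- dt_demo[, .(count = .N), keyby = var1]

# add each group's share of all rows, in percent (column added by reference)
dt_counts[, perc := 100 * count / sum(count)]

dt_counts
```

A helper such as freqTable saves repeating these steps and adds options like the total row.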
Discretization or categorization functions in R tend to be wrappers around cut, which closes every interval on the same side and so cannot handle a mix where some intervals are closed on the left and others on the right. For example, we might want to categorize values as negative, zero, or positive using the intervals $(-\infty, 0)$, $[0,0]$, and $(0, \infty)$. categorizeByIntervals provides this functionality.
ci_intervals <- CategorizationIntervals(
  value = c(-1, 0, 1),
  v_s_intervals = c("(-Inf,0)", "[0,0]", "(0,Inf)"))
v_n_values <- seq(from = -1, to = 1, by = 0.5)
v_n_values
categorizeByIntervals(v_n_values, ci_intervals)

ci_intervals <- CategorizationIntervals(
  value = c("Negative", "Zero", "Positive"),
  v_s_intervals = c("(-Inf,0)", "[0,0]", "(0,Inf)"))
categorizeByIntervals(v_n_values, ci_intervals)
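The limitation of cut mentioned above can be seen directly in base R. The sketch below uses only base functions; the variable names f_right, f_left and v_s_category are illustrative. The nested ifelse is one manual workaround, not how categorizeByIntervals is implemented.

```r
v_n_values <- seq(from = -1, to = 1, by = 0.5)

# cut() applies `right` to every interval, so zero always joins a neighbour
f_right <- cut(v_n_values, breaks = c(-Inf, 0, Inf), right = TRUE)   # zero joins the lower interval
f_left  <- cut(v_n_values, breaks = c(-Inf, 0, Inf), right = FALSE)  # zero joins the upper interval

# manual workaround: separate zero from its neighbours with explicit comparisons
v_s_category <- ifelse(v_n_values < 0, "Negative",
                       ifelse(v_n_values == 0, "Zero", "Positive"))
v_s_category
```

The manual version works for three categories but becomes unwieldy as the number of intervals grows, which is the case an interval-based helper addresses.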
Sometimes we need to merge two data sets that share columns other than the keys. mergeWithColPrioritization does this and lets you indicate which data set takes priority for the shared columns. See the following example.
i_num <- 3
dt_prior <- data.table(
  id = 1:i_num,
  var1 = letters[1:i_num],
  var2 = 111)
dt_other <- data.table(
  id = 1:(i_num + 2),
  var1 = paste0(letters[10 + 1:(i_num + 2)], letters[10 + 1:(i_num + 2)]),
  var3 = -999)
v_s_keys <- "id"

# both dt_prior and dt_other have "var1"
dt_prior
dt_other

# now we merge, indicating that values from dt_prior have priority
mergeWithColPrioritization(dt_prior, dt_other, v_s_keys)
# thus for observations with id in c(1, 2, 3) we use values from dt_prior,
# and for the other rows we use values from dt_other
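To make the idea of column prioritization concrete, here is a rough equivalent written with data.table alone. It is a sketch under assumptions, not the akmisc implementation: it assumes a full outer join on the key and treats an NA in the prioritized data set as "missing", which may differ from what mergeWithColPrioritization does. The suffixes and the name dt_merged are illustrative.

```r
library(data.table)

dt_prior <- data.table(id = 1:3, var1 = letters[1:3], var2 = 111)
dt_other <- data.table(id = 1:5,
                       var1 = paste0(letters[11:15], letters[11:15]),
                       var3 = -999)

# full outer join on the key; the shared column var1 is split by suffix
dt_merged <- merge(dt_prior, dt_other, by = "id", all = TRUE,
                   suffixes = c(".prior", ".other"))

# take the prioritized value where it exists, otherwise fall back to the other
dt_merged[, var1 := fcoalesce(var1.prior, var1.other)]

# drop the two suffixed helper columns
dt_merged[, c("var1.prior", "var1.other") := NULL]

dt_merged
```

With more than one shared column the coalescing step has to be repeated per column, which is the bookkeeping a dedicated function removes.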
isColsUniqueIdentifier indicates whether the given columns uniquely identify the rows, i.e. whether they can serve as keys of the data.table.
dt_data <- data.table(
  var1 = c(1, 1, 2, 2),
  var2 = c("a", "b", "a", "b"))
dt_data
isColsUniqueIdentifier(dt_data, v_s_cols = "var1")
isColsUniqueIdentifier(dt_data, v_s_cols = c("var1", "var2"))
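The same kind of check can be sketched with data.table's anyDuplicated method, which returns 0 when no combination of the given columns repeats. The helper isUniqueBy below is hypothetical and only illustrates the idea; it is not part of akmisc and may not match how isColsUniqueIdentifier works internally.

```r
library(data.table)

dt_data <- data.table(var1 = c(1, 1, 2, 2),
                      var2 = c("a", "b", "a", "b"))

# hypothetical helper: TRUE when no combination of v_s_cols occurs twice
isUniqueBy <- function(dt, v_s_cols) {
  anyDuplicated(dt, by = v_s_cols) == 0L
}

isUniqueBy(dt_data, "var1")             # var1 alone repeats
isUniqueBy(dt_data, c("var1", "var2"))  # the pair is unique per row
```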