
LRO.utilities

R package: Utility functions and addins for RStudio.

Helper tools I use in many projects.

By Ludvig R. Olsen, Cognitive Science, Aarhus University. Started in February 2017.

Contact at: [email protected]

Main functions:

Installation

Development version:

install.packages("devtools")

devtools::install_github("LudvigOlsen/LRO.utilities")

Use

Addins

Examples

# Attach packages
library(LRO.utilities)
library(dplyr) # %>% 
library(knitr) # kable()
library(ggplot2)

polynomializer

Raise vectors to integer powers to create polynomial terms, from degree 2 up to the value of the degree argument.

E.g. create columns v^2, v^3, v^4...

# On vector

vect <- c(1,3,5,7,8)

polynomializer(vect, degree = 3) %>% 
  kable()

| vect | vect_2 | vect_3 |
|-----:|-------:|-------:|
|    1 |      1 |      1 |
|    3 |      9 |     27 |
|    5 |     25 |    125 |
|    7 |     49 |    343 |
|    8 |     64 |    512 |


# On vectors in dataframe

data <- data.frame(vect = vect,
                   bect = vect*3,
                   dect = vect*5)

polynomializer(data, 
               cols = c('bect','dect'), 
               degree = 3) %>% 
  kable()

| vect | bect | dect | bect_2 | bect_3 | dect_2 | dect_3 |
|-----:|-----:|-----:|-------:|-------:|-------:|-------:|
|    1 |    3 |    5 |      9 |     27 |     25 |    125 |
|    3 |    9 |   15 |     81 |    729 |    225 |   3375 |
|    5 |   15 |   25 |    225 |   3375 |    625 |  15625 |
|    7 |   21 |   35 |    441 |   9261 |   1225 |  42875 |
|    8 |   24 |   40 |    576 |  13824 |   1600 |  64000 |
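To make the column naming concrete, here is a minimal base R sketch of the same arithmetic. It is not the package's implementation: make_powers() is a hypothetical helper written only for illustration, and it assumes the generated columns are named with the source column name, an underscore, and the power, as in the tables above.

```r
# Hypothetical helper (not part of LRO.utilities): raise a vector to the
# powers 2..degree and name each result <name>_<power>.
make_powers <- function(v, name, degree) {
  stopifnot(degree >= 2)
  powers <- 2:degree
  out <- lapply(powers, function(p) v^p)
  names(out) <- paste0(name, "_", powers)
  as.data.frame(out)
}

vect <- c(1, 3, 5, 7, 8)

# Bind the original column to its squared and cubed versions
poly_df <- cbind(data.frame(vect = vect),
                 make_powers(vect, "vect", degree = 3))
print(poly_df)
```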

%ni%

"Not in"


c(1,3,5) %ni% c(2,3,6)
#> [1]  TRUE FALSE  TRUE
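For readers without the package installed, the same result can be reproduced in base R by negating %in%. The operator name %not_in% below is made up for this sketch so that it does not clash with the package's %ni%; how the package defines its operator internally is not shown here.

```r
# Hypothetical stand-in for "not in": TRUE where an element of x
# does not occur in table.
`%not_in%` <- function(x, table) !(x %in% table)

result <- c(1, 3, 5) %not_in% c(2, 3, 6)
print(result)
```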

scaler

Center and/or scale multiple columns of a dataframe.

scaler is designed to work with %>% pipelines.

scaler_fit returns fit_object with information used to transform data.

scaler_transform scales data based on the information in the fit_object.

scaler_invert inverts scaling based on the information in the fit_object.

scaler_ and scaler_fit_ are standard evaluation versions.


# Scale and center 'vect' and 'bect' 
# in dataframe from previous example
scaler(data, vect, bect) %>% 
  kable()

|       vect |       bect | dect |
|-----------:|-----------:|-----:|
| -1.3270176 | -1.3270176 |    5 |
| -0.6285873 | -0.6285873 |   15 |
|  0.0698430 |  0.0698430 |   25 |
|  0.7682733 |  0.7682733 |   35 |
|  1.1174885 |  1.1174885 |   40 |


# Only scaling 'vect'  - working in pipeline
data %>% 
  scaler(vect, center = FALSE) %>% 
  kable()

|      vect | bect | dect |
|----------:|-----:|-----:|
| 0.3492151 |    3 |    5 |
| 1.0476454 |    9 |   15 |
| 1.7460757 |   15 |   25 |
| 2.4445060 |   21 |   35 |
| 2.7937212 |   24 |   40 |


# Only center 'bect' and 'dect' 
# selecting with column index range
data %>% 
  scaler(2:3, scale = FALSE) %>% 
  kable()

| vect |  bect | dect |
|-----:|------:|-----:|
|    1 | -11.4 |  -19 |
|    3 |  -5.4 |   -9 |
|    5 |   0.6 |    1 |
|    7 |   6.6 |   11 |
|    8 |   9.6 |   16 |

Fit / Transform / Invert

# Fit scaler
fitted_scaler <- data %>% 
  scaler_fit()

fitted_scaler %>% kable()

| column | mean |        sd | center | scale |
|:-------|-----:|----------:|:-------|:------|
| vect   |  4.8 |  2.863564 | TRUE   | TRUE  |
| bect   | 14.4 |  8.590693 | TRUE   | TRUE  |
| dect   | 24.0 | 14.317821 | TRUE   | TRUE  |


# Transform data
scaled_df <- data %>% 
  scaler_transform(fit_object = fitted_scaler)

scaled_df %>% kable()

|       vect |       bect |       dect |
|-----------:|-----------:|-----------:|
| -1.3270176 | -1.3270176 | -1.3270176 |
| -0.6285873 | -0.6285873 | -0.6285873 |
|  0.0698430 |  0.0698430 |  0.0698430 |
|  0.7682733 |  0.7682733 |  0.7682733 |
|  1.1174885 |  1.1174885 |  1.1174885 |


# Invert scaling
scaled_df %>% 
  scaler_invert(fit_object = fitted_scaler) %>% 
  kable()

| vect | bect | dect |
|-----:|-----:|-----:|
|    1 |    3 |    5 |
|    3 |    9 |   15 |
|    5 |   15 |   25 |
|    7 |   21 |   35 |
|    8 |   24 |   40 |
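The arithmetic behind the fit, transform, and invert steps can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: it assumes that scaling divides by the sample standard deviation and centering subtracts the mean, which is consistent with the means, standard deviations, and scaled values in the tables above. The object names fit, scaled, and restored are chosen for this sketch only.

```r
data <- data.frame(vect = c(1, 3, 5, 7, 8))
data$bect <- data$vect * 3
data$dect <- data$vect * 5

# "Fit": store each column's mean and sample standard deviation
fit <- data.frame(column = names(data),
                  mean = sapply(data, mean),
                  sd = sapply(data, sd),
                  row.names = NULL)

# "Transform": subtract the stored mean, then divide by the stored sd
scaled <- as.data.frame(
  Map(function(x, m, s) (x - m) / s, data, fit$mean, fit$sd)
)

# "Invert": multiply by the stored sd, then add the stored mean back
restored <- as.data.frame(
  Map(function(x, m, s) x * s + m, scaled, fit$mean, fit$sd)
)

print(fit)
print(scaled)
print(restored)
```

Keeping the fitted means and standard deviations separate from the data is what makes it possible to apply the same scaling to new data and to undo it later.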

binarizer

Binarize multiple columns of a dataframe based on a given threshold.

binarizer is designed to work with %>% pipelines.

binarizer_ is a standard evaluation version.


scaled_df %>% 
  binarizer(thresh = 0)
#> # A tibble: 5 × 3
#>    vect  bect  dect
#>   <dbl> <dbl> <dbl>
#> 1     0     0     0
#> 2     0     0     0
#> 3     1     1     1
#> 4     1     1     1
#> 5     1     1     1
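The thresholding itself can be sketched in base R. The function binarize() is a hypothetical stand-in, and the sketch assumes that values strictly above the threshold become 1 and all others 0; the example output above contains no value exactly equal to the threshold, so how the package treats ties is not confirmed here.

```r
# Scaled values of 'vect' copied from the transform example
scaled_vect <- c(-1.3270176, -0.6285873, 0.0698430, 0.7682733, 1.1174885)

# Hypothetical helper: 1 where x is above thresh, otherwise 0
# (assumption: the comparison is strict)
binarize <- function(x, thresh = 0) as.numeric(x > thresh)

binary <- binarize(scaled_vect, thresh = 0)
print(binary)
```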

rename_col

Rename a single column in a dataframe. This is similar to plyr::rename, but renames only one column at a time.

rename_col(data, 
           old_name = 'bect', 
           new_name = 'sect') %>% 
  kable()

| vect | sect | dect |
|-----:|-----:|-----:|
|    1 |    3 |    5 |
|    3 |    9 |   15 |
|    5 |   15 |   25 |
|    7 |   21 |   35 |
|    8 |   24 |   40 |
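For comparison, the same rename can be written in base R by assigning to names(). This only shows the equivalent operation; it says nothing about how rename_col is implemented.

```r
data <- data.frame(vect = c(1, 3, 5, 7, 8))
data$bect <- data$vect * 3
data$dect <- data$vect * 5

# Replace the name 'bect' with 'sect', leaving the other columns untouched
renamed <- data
names(renamed)[names(renamed) == "bect"] <- "sect"

print(names(renamed))
```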

savage_dickey

Calculate the Bayes factor from two distributions and plot both distributions.

Returns a list with a ggplot2 object, BF10, and BF01.

prior <- rnorm(1000, mean=0, sd=1)
posterior <- rnorm(1000, mean=2, sd=2)

s_d <- savage_dickey(posterior, prior, Q = 0, plot = TRUE)

s_d$BF10
#> [1] 0.274728

s_d$BF01
#> [1] 3.639963

s_d$post_prior_plot +
  theme_bw()
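The idea of comparing the two distributions at the point Q can be sketched with base R kernel density estimates. This is a rough illustration of a Savage-Dickey style density ratio, not the package's method: density_at() is a hypothetical helper, the default density() bandwidth is an arbitrary choice, and which of the two ratios the package reports as BF10 and which as BF01 is not confirmed here, so the sketch uses neutral names.

```r
set.seed(1)
prior <- rnorm(1000, mean = 0, sd = 1)
posterior <- rnorm(1000, mean = 2, sd = 2)

# Hypothetical helper: kernel density estimate of the samples,
# linearly interpolated at a single point
density_at <- function(samples, point) {
  d <- density(samples)
  approx(d$x, d$y, xout = point)$y
}

Q <- 0

# Ratio of posterior to prior density at Q, and its reciprocal
post_over_prior <- density_at(posterior, Q) / density_at(prior, Q)
prior_over_post <- 1 / post_over_prior

print(post_over_prior)
print(prior_over_post)
```

The two ratios are reciprocals of each other, which matches the relationship between the BF10 and BF01 values printed above.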

roll_previous

Wrapper for zoo::rollapply that applies a function to rolling windows and returns, for each element, the result of the preceding window. NAs are added at the start only.

# Create dataframe
df <- data.frame('round' = c(1,2,3,4,5,6,7,8,9,10),
                 'score' = c(5,3,4,7,6,5,2,7,8,6))

# For each row we find the mean score of the previous 2 rounds
df$mean_prev_score = roll_previous(df$score, width = 2, FUN = mean)

df %>% kable()

| round | score | mean_prev_score |
|------:|------:|----------------:|
|     1 |     5 |              NA |
|     2 |     3 |              NA |
|     3 |     4 |             4.0 |
|     4 |     7 |             3.5 |
|     5 |     6 |             5.5 |
|     6 |     5 |             6.5 |
|     7 |     2 |             5.5 |
|     8 |     7 |             3.5 |
|     9 |     8 |             4.5 |
|    10 |     6 |             7.5 |
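What "the result of the previous window" means can be spelled out with a plain loop in base R. The function previous_window() is a hypothetical illustration that avoids zoo entirely; it is not how roll_previous is implemented, but it reproduces the mean_prev_score column above.

```r
# Hypothetical helper: for element i, apply FUN to the 'width' elements
# that come immediately before it. The first 'width' elements have no
# complete preceding window and stay NA.
previous_window <- function(x, width, FUN) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n > width) {
    for (i in (width + 1):n) {
      out[i] <- FUN(x[(i - width):(i - 1)])
    }
  }
  out
}

score <- c(5, 3, 4, 7, 6, 5, 2, 7, 8, 6)
mean_prev <- previous_window(score, width = 2, FUN = mean)
print(mean_prev)
```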



LudvigOlsen/LRO.utilities documentation built on May 27, 2018, 4:21 p.m.