Intro

TO BE DEFINED

Install Packages and Create Dummy Data {.tabset .tabset-fade .tabset-pills}

Install Libraries

For this package we need to install the tidyverse ad DT packages

library(tidyverse)
library(DT)
library(flextable)
library(officer)
library(utildf)


#source(AuxFun2) # this is where the functions are stored  [REVIEW THIS IS NOT WORKING]

Create a new tidy data set

The first step in this guide is to create a set of database that will help us to explain the uses of the functions in this package. We take as starting point the cars database included in the tidyverse package by extracting the models and manufacturer:

#create 10 owners for the example

DfCars.1<-mpg%>%select(manufacturer,model)%>%unique()

#datatable(DfCars.1, rownames = FALSE, filter="top", options = list(pageLenth = 5, scrollX=T))

For this example I choose to add random values to new variables created to comple the dataframe example: mpg miles *year

MkDfCars<-function(){
  DfCars.1<-mpg%>%select(manufacturer,model)%>%unique()
  set.seed(1234)
  DfCars.b<-DfCars.1%>%
    mutate(mpg=sample(10.1:20.0, nrow(DfCars.1), replace=TRUE),
           miles=sample(1000:1100,nrow(DfCars.1), replace=TRUE),
           year=2000,
           car=1)
  set.seed(4567)  
  for (i in 1:2){
    DfCars.z1<-DfCars.1%>%
      mutate(mpg=sample(10.1:20.0, nrow(DfCars.1), replace=TRUE),
             miles=sample(1000:1100,nrow(DfCars.1), replace=TRUE),
             year=2000,
             car=1)
    DfCars.z1<-DfCars.z1%>%mutate(car=i+1)
    DfCars.b<-rbind(DfCars.b,DfCars.z1)
  }
  set.seed(8901)
  for (j in 1:3){
    for (i in 1:10){
      DfCars.a<-DfCars.1%>%
        mutate(mpg=sample(10.1:20.0, nrow(DfCars.1), replace=TRUE),
               miles=sample(1000:2000,nrow(DfCars.1), replace=TRUE),
               year=2000,
               car=1)
      DfCars.a<-DfCars.a%>%mutate(year=year+i,car=j)
      DfCars.b<-rbind(DfCars.b,DfCars.a)
    }
  }
  DfCars<<-DfCars.b
}

MkDfCars()
#DfCars
datatable(DfCars, rownames = FALSE, filter="top", options = list(pageLenth = 5, scrollX=T))

However this dataframe is not "Tidy". For that we gather the data and all atomic valuea are moved to a new column. This will create two new columns:

Additional change is needed now on the dates, as needed in next steps. This is because the gather/ spread function does not work well with dates [UNDER REVIEW].

MkDfCarsTidy<-function(){
  DfCars.b<-DfCars%>%#select(-car)%>%
    gather("measure","value",-year,-manufacturer,-model,-car)%>%
    group_by(year,manufacturer,model,measure,car)%>%
    #mutate(value=mean(value))%>%
    unique()%>%ungroup()

  DfCarsTidy<<-DfCars.b%>%mutate(year=as.factor(sub("^","year.",year)))
  }
MkDfCarsTidy()
#DfCarsTidy
datatable(DfCarsTidy, rownames = FALSE, filter="top", options = list(pageLenth = 5, scrollX=T))

Change Function

In this example we create first a filter for the years 2000 and 2001 for the model A4

x<-DfCarsTidy%>%filter(year%in%c("year.2000","year.2001"),model=="a4")
datatable(x, rownames = FALSE, filter="top", options = list(pageLenth = 5, scrollX=T))

Then we calculate the changes between the two years (2000 and 2001)

x2<-utildf::addcht(x,VarA="year.2001",VarB="year.2000",Var="year",Value="value")
datatable(x2, rownames = FALSE, filter="top", options = list(pageLenth = 5, scrollX=T))

Finally we create a table with the dates and changes in columns for analysis

x3<-x2%>%
  spread(key=year,value=Value)
datatable(x3, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

Add Ranks

Introduction

Rank Example

Here an example in that we filter first the year 2001 and 2000 and miles and spread them as we will add rank to 2001.

We use the addRank function to add rank column to the table for the rank on 2001.miles on the previously calculated example of the function addCh.

x4<-utildf::addrank(df=x3,Var =  "year.2001")
datatable(x4, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

ADD NAME OF THE VAR RANKED TO THE RANK COLUMN NAME

Add Weights

This function can be used to add weighted average metrics This function requires the Weight variable to be provided, which can be raw data or weight (1%-99%), in both cases the sum of the weight*Var will be devided by sum(weights) the VarCero exclude from the weights the Var with cero value, meaning that only consider the ones with values other than cero!!!

Example of AddW function

The first example takes the transformation used in the AddRanks example based in the DfCarsTidy data. The variable that we want to weight is the "mpg" and weighted by "miles".

x<-DfCarsTidy%>%
  select(manufacturer,model,year,car,measure,value)%>%spread(key=measure,value = value)
x
x1<-utildf::addw(df=x,Var="mpg",Weight="miles")
x1
x1<-utildf::addw(df=x,Var="mpg",Weight="miles",Group=c("model","year"),FilterVarCero = FALSE)
x1

Totals Functions

Total Functions

This addtasbr function adds totals values to the database and creates a new row with the total of each numeric column selected.

Example Function "gettotalenq""

x4.1<-utildf::gettotalenq(df=x4,Chng_year.2001_year.2000,year.2001,year.2000)

datatable(x4.1, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

Example Function "gettotalesyms""

x4.1<-utildf::gettotalsym(df=x4,"Chng_year.2001_year.2000","year.2001","year.2000")

datatable(x4.1, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

Example of the "addtasbr" function

x5.1<-utildf::addtasbr(df=x4,Rank=MyRank,RankT=3,Chng_year.2001_year.2000,year.2001,year.2000)

datatable(x5.1, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

x5.2<-utildf::addtasbr(df=x4,Rank=MyRank,RankT=8,Chng_year.2001_year.2000,year.2001,year.2000)

datatable(x5.2, rownames = FALSE, filter="top", options = list(pageLenth = 10, scrollX=T))

Test Flexitable output

x5.1
flextable(x5.1)%>%autofit(add_w = 0,add_h = 0)%>%border(i=3,border.bottom = fp_border(color="black"))

x5.2
flextable(x5.2)%>%
  autofit(add_w = 0,add_h = 0)%>%border(i=nrow(x5.2)-1,border.bottom = fp_border(color="black"))%>%
  autofit(add_w = 0,add_h = 0)%>%border(i=nrow(x5.2)-2,border.bottom = fp_border(color="black"))


aalvaradob/utildf documentation built on Oct. 7, 2020, 3:05 p.m.