computeWeightedMeans: Compute Weighted Mean by Group


View source: R/computeWeightedMeans.R

Description

This function computes the weighted mean of one or more variables, by group, from a data.table. computeWeightedMeans is performance optimized and designed to work well in bulk operations. The function returns a data.table.
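The page does not spell out the formula, so the following assumes the standard definition of a weighted mean. For a group g with observations x_i and weights w_i (both symbols are introduced here for illustration), each variable would be aggregated as:

```latex
\bar{x}_g = \frac{\sum_{i \in g} w_i \, x_i}{\sum_{i \in g} w_i}
```

With equal weights this reduces to the ordinary arithmetic mean of the group.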

Usage

computeWeightedMeans(data_table, variables, weight, by)

Arguments

data_table

a data.table

variables

character name(s) of the variable(s) to average. The variables must be columns of the data.table.

weight

character name of the data.table column that contains a weight.

by

character vector of the columns to group by
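To make the roles of the four arguments concrete, here is a minimal sketch of a grouped weighted mean written directly in data.table. It is not the package's implementation; the toy table and the names dt, vars, w_col and by_cols are hypothetical and only mirror the arguments variables, weight and by described above.

```r
library(data.table)

# Toy data (hypothetical): one grouping column, one variable, one weight column
dt <- data.table(g = c("a", "a", "b", "b"),
                 x = c(1, 3, 2, 4),
                 w = c(1, 3, 1, 1))

vars    <- "x"   # plays the role of `variables`
w_col   <- "w"   # plays the role of `weight`
by_cols <- "g"   # plays the role of `by`

# For each group, apply stats::weighted.mean to every variable column,
# using that group's slice of the weight column as weights
res <- dt[, lapply(.SD[, vars, with = FALSE], weighted.mean, w = .SD[[w_col]]),
          by = by_cols, .SDcols = c(vars, w_col)]
res
```

Going by the Usage section, the corresponding package call would presumably be computeWeightedMeans(dt, "x", "w", "g"); the exact layout of the returned data.table is not documented here, so the sketch makes no claim about matching it.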

Author(s)

Matthias Bannert, Gabriel Bucur

Examples

# TODO: add new weight columns to BTS demo
# load library and dataset
library(panelaggregation)
data(btsdemo)
head(btsdemo)
# adapt the levels to positive, equal and negative
# in order to suit the naming defaults. other levels work too, 
# but you'd need to specify multipliers in computeBalance then
levels(btsdemo$question_1) <- c("pos","eq","neg")

# compute the weighted shares and store them in wide format 
# to get a basis for further steps
level1 <- computeShares(btsdemo,"question_1","weight", 
                        by = c("date_qtrly","group", "altGroup", "sClass"))

# compute balance, don't have to do much here, because
# (pos, eq, neg) is the default for the possible answers
level1_wbalance <- computeBalance(level1)

# Select a particular grouping combination and a timeseries that 
# should be extracted from the level 1 aggregation.
ts1 <- extractTimeSeries(level1_wbalance,
                         "date_qtrly",
                         list(group = "C", altGroup = "a", sClass = "S"),
                         freq = 4,
                         item = "balance",
                         variable = "question_1")
ts1
# Plot a standard R ts using the plot method for ts
plot(ts1, main = attributes(ts1)$ts_key)

# Add a weight column to the aggregated results.
# In order to join the tables, we need to know what weight to assign to each row.
# This is done via a common key, for example c('group', 'altGroup').
# With that key we would assign a different weight to each 
#   c('group', 'altGroup') combination (e.g. c('A', 'a')).
# Here the key is simply 'group'.
btsweight1 <- btsdemo[, list(weight = sum(weight)), by = 'group']
btsagg1 <- joinDataTables(level1_wbalance, btsweight1, 'group')

# Compute second level aggregation, this time on fewer columns and using a different set of weights.
level2_balance <- computeWeightedMeans(btsagg1, c('item_pos', 'item_eq', 'item_neg', 'balance'), 
                                       'weight', c("date_qtrly","group", "sClass"))

# Select a particular grouping combination and a timeseries that 
# should be extracted from the level 2 aggregation.
ts2 <- extractTimeSeries(level2_balance,
                         "date_qtrly",
                         list(group = "C", sClass = "S"),
                         freq = 4,
                         item = "balance",
                         variable = "question_1")
ts2
# Plot a standard R ts using the plot method for ts
plot(ts2, main = attributes(ts2)$ts_key)

# Add a weight column to the aggregated results.
# As before, the tables are joined via a common key so that each row
# gets the weight that belongs to it. Here the key is 'sClass'.
btsweight2 <- btsdemo[, list(weight = sum(weight)), by = 'sClass']
btsagg2 <- joinDataTables(level2_balance, btsweight2, 'sClass')

# Compute third level of aggregation, on the whole sector, using yet another set of weights.
level3_balance <- computeWeightedMeans(btsagg2, 'balance', 'weight', c("date_qtrly", "sClass"))

# Select a particular grouping combination and a timeseries that 
# should be extracted from the level 3 aggregation.
ts3 <- extractTimeSeries(level3_balance,
                         "date_qtrly",
                         list(sClass = "S"),
                         freq = 4,
                         item = "balance",
                         variable = "question_1")
ts3
# Plot a standard R ts using the plot method for ts
plot(ts3, main = attributes(ts3)$ts_key)

Example output

Loading required package: data.table
   uid year period weight question_1 question_2 question_3 question_4
1:   2 1997      3      1          2          2          2          1
2:   2 1997      1      1          1          2          3          1
3:   2 1995      4      1          2          1          3          1
4:   2 1998      4      1          3          2          3          3
5:   2 2001      3      1          1          3          2          2
6:   2 1999      4      1          2          3          1          2
   question_5 group altGroup sClass date_qtrly
1:          2     A        a      S 1997-07-01
2:          2     A        a      S 1997-01-01
3:          3     A        a      S 1995-10-01
4:          3     A        a      S 1998-10-01
5:          3     A        a      S 2001-07-01
6:          1     A        a      S 1999-10-01
              Qtr1          Qtr2          Qtr3          Qtr4
1995 -0.0545830462  0.0288077188 -0.3718939812  0.0199862164
1996 -0.5014472777  0.1983459683  0.3882531366 -0.3687112336
1997 -0.1538673652  0.1518952447 -0.2290833908 -0.0008270159
1998 -0.3101641153  0.0000000000  0.1481736733 -0.3601654032
1999  0.1447277739  0.1403170227 -0.2453818583 -0.0150241213
2000 -0.0880771881  0.0024810476 -0.1689869056  0.1634734666
2001 -0.1317711923  0.4347346657 -0.3554789800              
             Qtr1         Qtr2         Qtr3         Qtr4
1995  0.137233631  0.201760680 -0.273770077  0.134753045
1996 -0.044960029  0.022723515  0.025232794  0.159546288
1997 -0.302547004  0.254383504 -0.156156701  0.276429660
1998 -0.156377126 -0.083758159 -0.042750275 -0.008252982
1999  0.018357218 -0.019122172 -0.149692958 -0.061159112
2000  0.024826369  0.287670703  0.127653440  0.184513700
2001 -0.035004189  0.210614528 -0.272243093             
              Qtr1          Qtr2          Qtr3          Qtr4
1995  0.0896820503  0.0119279699 -0.0484929716  0.0598320672
1996  0.0296122264  0.0060300982 -0.0097891576  0.1270929141
1997 -0.0906111838  0.1376933061 -0.0157739068  0.0738305125
1998 -0.0591716880 -0.0974707069 -0.0115693631  0.0200627581
1999  0.0241380578  0.0936625440 -0.0158667069 -0.1037388731
2000 -0.0006253058  0.0797590485  0.0514504681  0.0830437685
2001  0.0692833987  0.0178078679 -0.0716521786              

panelaggregation documentation built on May 2, 2019, 2:14 p.m.