In CfSOtago/GREENGridEECA: Supports Analysis of NZ GREEN Grid data for EECA

knitr::opts_chunk$set(echo = FALSE) # by default turn off code echo

# Set start time ----
startTime <- proc.time()

# Local parameters ----
b2Kb <- 1024 #http://whatsabyte.com/P1/byteconverter.htm
b2Mb <- 1048576
plotLoc <- paste0(repoParams$repoLoc, "/docs/plots/") # where to put the plots (if any)

# Packages used in the report ----
rmdLibs <- c("ggplot2", # plots
          "kableExtra", # fancy tables
          "hms", # times
          "skimr", # for skim
          "tidyr", # for gather
          "dplyr", # for filter
          "readr"#for reading .csv
          )
GREENGridEECA::loadLibraries(rmdLibs)

# Local functions ----

Abstract

This report uses the GREEN Grid project research data to analyse a variety of residential household appliances and their contribution to peak demand under several scenarios. Although the data does not derive from a representative sample of households, we demonstrate that identifiable heating contributes ~ 21% to residential peak (17:00-21:00) demand in winter with Hot Water at ~ 17%, Lighting at 14% and Ovens at 7% while non-identified appliances contribute ~ 40%. These percentage contributions are generally similar at both regional network and sample co-incident peaks and across seasons although total power demand varies by season according to the appliance. Thus 'Others', heating and lighting are substantially lower in summer and whilst further research is clearly needed to unpack 'Other' demand, this suggests heating may be a major component.

Our results also show that simple mean (or median) values or single indicators such as average load factors mask considerable variation both within and between households. Overall we conclude that future work should focus on collecting data from a larger and representative sample of New Zealand households, ensuring that appliances are identified on circuits (especially electric heaters) and that both inter and intra-household heterogeneity should be adequately represented in analytic results.

\newpage

About

Citation

Report circulation:

Public – this report is intended for publication following EECA approval.

License

History

this report's edit history

Support

\newpage

Introduction

This report uses the GREEN Grid project [@stephenson_smart_2017] research data to analyse a variety of residential household appliances and their contribution to peak demand under several scenarios.

Data

As background, Table \@ref(tab:dataDesc) shows the mean half-hourly and mean total kW and kWh by season for the sample households in 2015. For this purpose the periods are defined as:

21:00 - 07:00 Off peak (night)
07:00 - 09:00 Peak (morning)
09:00 - 17:00 Off peak (day)
17:00 - 21:00 Peak (evening)

Note that the mean kWh values are substantially different from the medians indicating a skewed distribution.

#Conversion to energy Wh
powerDT[, energyWh := meanPowerW/2]
powerDT[, hour := lubridate::hour(r_dateTime_nz)]
powerDT[, ba_peak := ifelse(hour < 7, "Off peak (night)", NA)]
powerDT[, ba_peak := ifelse(hour >= 7 & hour < 9, "Peak (morning)", ba_peak)]
powerDT[, ba_peak := ifelse(hour >= 9 & hour < 17 , "Off peak (day)", ba_peak)]
powerDT[, ba_peak := ifelse(hour >= 17 & hour < 21 , "Peak (evening)", ba_peak)]
powerDT[, ba_peak := ifelse(hour >= 21, "Off peak (night)", ba_peak)]
test <- table(powerDT$ba_peak,powerDT$hour)

# test table
t <- powerDT[year == 2015 & eecaCircuit == "Total", .(sumkWh = sum(energyWh/1000, na.rm = TRUE),
                             meankWh = mean(energyWh/1000, na.rm = TRUE),
                             mediankWh = median(energyWh/1000, na.rm = TRUE), 
                             meankW = mean(meanPowerW/1000, na.rm = TRUE),
                             sdkW = sd(meanPowerW/1000, na.rm = TRUE),
                 nDays = uniqueN(date),
                 nHouseholds = uniqueN(linkID)), keyby = .(season,ba_peak)]
t[, meanTotalkWh := sumkWh/(nDays * nHouseholds)]
# NB in this definition 'peak' is only evening peak (not morning)
keepVars <- c("season", "ba_peak","nHouseholds", "meankW","sdkW","meankWh", "mediankWh", "meanTotalkWh")

new <- c("Season", "Peak", "N households", "Mean half-hourly kW", "s.d. (half-hourly kW)", "Mean half-hourly kWh", "Median half-hourly kWh","Mean total kWh for the period")

table_t <- t[, ..keepVars]

setnames(table_t, keepVars, new)

kableExtra::kable(table_t, digits = 2, caption = "Descriptive statistics for power data (overall household total demand/energy use)") %>%
  kable_styling()

The following table estimates the mean total kWh per period using a different method (i.e. directly from the half-hourly level data) as a cross-check.

t2 <- powerDT[year == 2015 & eecaCircuit == "Total", .(sumkWh = sum(energyWh/1000, na.rm = TRUE), # add up the total kWh
                  nDays = uniqueN(date)), keyby = .(linkID, season, ba_peak)] # for each hh, season and period
t2[, meanSumkWh := sumkWh/nDays]

table_t2 <- t2[, .(mean_SumkWh = mean(meanSumkWh)), keyby = .(season, ba_peak)]

kableExtra::kable(table_t2[!is.na(season)], digits = 2, caption = "Mean household total kWh per period and season (alternative method)") %>%
  kable_styling()

Figure \@ref(fig:plotkWh) shows the mean total kWh consumed per household by period and season. Note that this total reflects not only the power demand level but the number of hours in each period.

# plot sums at household level
plotDT <- powerDT[year == 2015 & eecaCircuit == "Total", .(sumkWh = sum(energyWh/1000, na.rm = TRUE),
                             meankW = mean(meanPowerW/1000, na.rm = TRUE),
                             sdkW = sd(meanPowerW/1000, na.rm = TRUE),
                 nDays = uniqueN(date),
                 nHouseholds = uniqueN(linkID)), keyby = .(season,ba_peak, linkID)]

plotDT[, meanTotalkWh := sumkWh/(nDays)]

p <- ggplot2::ggplot(plotDT, aes(y = meanTotalkWh, 
                                           x = ba_peak, 
                                           colour = season, 
                                           group = ba_peak
                                 )
                     ) +
  geom_boxplot() +
  labs(x = "Time of day",
         y = "Mean total kWh")

p +
  facet_grid(season ~ .) +
   theme(legend.position = 'bottom') +
  scale_color_discrete(name = "Season")

Figure \@ref(fig:plotkW) on the other hand shows the mean half-hourly kW demand per household by period and season and reflects the greater power demand in the evening and morning peak periods.

p <- ggplot2::ggplot(plotDT, aes(y = meankW, 
                                           x = ba_peak, 
                                           colour = season, 
                                           group = ba_peak
                                 )
                     ) +
  geom_boxplot() +
  labs(x = "Time of day",
         y = "Mean kW")

p +
  facet_grid(season ~ .) +
   theme(legend.position = 'bottom') +
  scale_color_discrete(name = "Season")

Peak Contribution

In this section we use the GREEN Grid residential sample to estimate the contribution of each appliance to peak demand. For this analysis we use demand data averaged over 30 minute intervals. We consider three different definitions of peak demand:

Sample winter evenings 17:00 - 21:00.
- Mean winter evening load:
  - find the average load across each circuit during the period
  - calculate the percentage contribution of each identified load type to this average peak load.
- Maximum winter evening load
  - find the maximum half-hour demand for each household during the defined peak period
  - calculate the percentage contribution of each separately identifiable load type (lighting, hot water, heat pump, oven) in that half hour.
Regional/network peak. In this case we:
- find the 100 maximum regional network (GXP) peak half hours
- calculate the mean contribution of each identifiable load to mean total load across all households in that region across these 100 half-hours
Sample co-incident peak. In this case we:
- find the half-hour across the sample which has the highest total combined demand
- calculate mean contribution of each identifiable load to mean total load across all households in this half-hour

To do this last part we classify the circuit labels according to the load types mentioned above (see Table \@ref(tab:categoriseCircuitsTable) in the Data Annex).

Winter (June - August) evenings 17:00 – 21:00

In this section we select the peak-period half hours for all households for the winter evening period and then calculate the mean demand for each dwelling over the period of peak demand.

Table \@ref(tab:averageAtPeak) shows the winter mean demand (W) at 'peak' for each dwelling and circuit in 2015. The complete data table across all houses is saved as “Extracted mean demand by dwelling and circuit at 'peak'.csv” and is supplied with the report.

powerDT[, year := lubridate::year(r_dateTime_nz)] # so we can do per-year analysis

subsetDT <- powerDT[season == "Winter" & 
                    year == 2015 &
                    obsHalfHour >= hms::as.hms("17:00:00") & 
                    obsHalfHour <= hms::as.hms("21:00:00")]



# extract _just_ the circuits which are (or contain):
# - total load
# - heat pumps
# - hot water
# - oven
# - lighting


#Calculating average demand over the period for each dwelling and circuit

dataBucketDT <- subsetDT[, .(avgPeakDemandW = mean(meanPowerW, na.rm = TRUE)), keyby =.(linkID, eecaCircuit)]


kableExtra::kable(head(dataBucketDT, 10), digits = 2,
                  caption = "Example data: Extracted mean demand by dwelling and circuit at 'peak'")  %>% # test result
  kable_styling()


data.table::fwrite(dataBucketDT, paste0(here::here(), "/data/output/Extracted mean demand by dwelling and circuit at 'peak'.csv"))

#This section aims to calculate the wattage contribution. We recycle previous results...

ContriW1DT <- dcast(dataBucketDT,#This tis the contribution DT for wattage and the first definition of peak demand
           linkID  ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           value.var = "avgPeakDemandW")
ContriW1DT<-ContriW1DT[, c( "Total") := NULL]#We do not need this column

ContriW1DT <- setnames(ContriW1DT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))
ContriW1DT <- data.table::as.data.table(gather(ContriW1DT, eecaCircuit, avgPeakDemandW, `Heat Pump`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot
# except it depends on the order of the variables which is dodgy!

ContriW1DT <- ContriW1DT[, .(averageContri = mean(avgPeakDemandW, na.rm = TRUE), 
                              nobs=.N, sd=sd(avgPeakDemandW, na.rm = TRUE)), keyby=.(eecaCircuit)]


ContriW1DT$eecaCircuit <- factor(ContriW1DT$eecaCircuit, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

#Adding percentages

ContriW1DT <- ContriW1DT[, totalW := sum(averageContri)]
ContriW1DT <- ContriW1DT[, percentage := (averageContri/totalW)*100]

ContriW1DT$eecaCircuit <- factor(ContriW1DT$eecaCircuit, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

myPlot <- ggplot2::ggplot(ContriW1DT, aes(x= eecaCircuit, fill = eecaCircuit)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour ="black"),
        legend.position = "none")+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Circuit', y='Power in W/house') 
myPlot

myPlot <- ggplot2::ggplot(ContriW1DT, aes(x= "", y= percentage, fill = eecaCircuit)) +
  geom_bar(stat = "identity", width = 1, color ="white")+
  #labs(x='', y='percentage')+
   #facet_grid(facets=. ~season) +
  coord_polar("y", start = 0)+
  theme_void()+
 theme(#text = element_text(family = "Cambria"),
       legend.position = 'bottom')+
  guides(fill=guide_legend(title="Circuit"))+
  geom_text(aes(y = percentage,
              label = sprintf("%0.0f%%",percentage)),
           position = position_stack(vjust = .5),
           size = 5)

myPlot

We have then used this data to calculate the mean contribution of each identifiable load to peak demand (time period 17:00-21:00) across the sample for winter in 2015. This is shown in Figure \@ref(fig:wattageContributionPeak1Plot1) and Figure \@ref(fig:wattageContributionPeak1PlotPercent1) (as percentage contribution). On this measure, of the identifiable loads, Heat Pumps contributed the most to peak demand, followed by Hot Water, Lighting, and Oven. However overall 'Other' loads contributed the most at ~r round(ContriW1DT[eecaCircuit == "Other"]$percentage)% suggesting further research is needed to disaggregate this demand where possible (see Table \@ref(tab:categoriseCircuitsTable)) for circuits contributing to 'Other'.

Maximum load per season (W)

In this section we extract the half-hour which had the maximum total load during the evening period (17:00-21:00) for each household in season.

powerDT[, year := lubridate::year(r_dateTime_nz)] # so we can do per-year analysis

selectDT <- powerDT[circuit == 'imputedTotalDemand_circuitsToSum_v1.1' & 
                   obsHalfHour >= hms::as.hms("17:00:00") & 
                     obsHalfHour <= hms::as.hms("21:00:00"), .SD[which.max(meanPowerW)], # finds only the rows which are the max of the mean total load (XX should we be using the maxPowerW?? XX)
                   by = .(linkID, season, year)] # max value within household, season, year

# This is the time of maximum peak demand for each household

#kableExtra::kable(head(selectDT), caption = "Extracted peak load half-hours - first 6 rows")  %>% # test result
#  kable_styling()

# dataBucketDT <- data.table::data.table()
selectDT <- selectDT[,.(r_dateTime_nz, linkID)]
setkey(selectDT, r_dateTime_nz, linkID)
setkey(powerDT, r_dateTime_nz, linkID)
dataBucketDT <- powerDT[selectDT] # way quicker - selects the ones that match
#   
# for(n in 1:nrow(selectDT)){
#   # extract the rows which match the dateTime
#   row <- selectDT[n]
#   dt <- powerDT[linkID == row$linkID & # matches household
#                   r_dateTime_nz == row$r_dateTime_nz, # matches dateTime
#                 .(linkID, year, season, r_dateTime_nz, circuit, nObs, meanPowerW, minPowerW, maxPowerW, eecaCircuit, circuitLabel)] # selects just the basic columns we need
#   dataBucketDT <- rbind(dataBucketDT, dt)
# }



# extract _just_ the circuits which are (or contain):
# - total load
# - heat pumps
# - hot water
# - oven
# - lighting

Table \@ref(tab:winterPeaktestTable) shows the winter maximum load half-hours for each dwelling and year during the winter peak period as defined above. In this data extract, NaN indicates that no circuit data was obtained for this dwelling in this specific time period. We also calculate the proportionate contribution of each identifiable load to the total load. The complete data table across all houses is saved as “Extracted seasonal peak load half-hour circuits.csv” and is supplied with the report.

# we do not imput 0 for NaN/NA cells as this would mean we have measured
# 0 when we have not. We just don't know the value as there was
# no circuit of this type. If we include 0 we reduce the mean/% yet
# the true heating could have been in 'Other'

resAllSeasonsDT <- dcast(dataBucketDT,
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN 
           value.var = "meanPowerW") # take 1 household and turn data round

# Now just calculate the % using the identified circuits?


resAllSeasonsDT <- resAllSeasonsDT[, pc_HW := `Hot water`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_HP := `Heat Pump or Heating`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_LI := Lighting/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_OV := Oven/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_Other := Other/Total]

kableExtra::kable(head(resAllSeasonsDT[season == "Winter"], 10), digits = 2,
                  caption = "Example data: Extracted winter peak load half-hour circuits (NA indicates no such circuit)")  %>% # test result
  kable_styling()

data.table::fwrite(resAllSeasonsDT, paste0(here::here(), "/data/output/Extracted seasonal peak load half-hour circuits.csv"))

summaryP1DT <- copy(resAllSeasonsDT) #This is the summary table to present to EECA for the first definition of peak demand

summaryP1DT<-summaryP1DT[, c("Heat Pump or Heating", "Hot water", "Lighting", "Oven", "Total", "Other") := NULL]

#summaryP1DT<-summaryP1DT[, averageContriHW := ]
summaryP1DT <- setnames(summaryP1DT, old=c("pc_HW","pc_HP", "pc_LI", "pc_OV", "pc_Other"),
                        new=c("Hot Water", "Heat Pump", "Lighting", "Oven", "Other"))

#summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaCircuit, meanPowerW, `Heat Pump or Heating`:`Oven`, factor_key = TRUE))
summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaLabelContri, contribution, `Hot Water`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

summaryP1DT <- summaryP1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

summaryP1DT <- summaryP1DT[, averageContri := averageContri*100 ]#Conversion to % for plot

summaryP1DT$season <- factor(summaryP1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot

#XX_eval =FALSE_XX

myPlot <- ggplot2::ggplot(summaryP1DT[season == "Winter" & year == 2015], aes(x= eecaLabelContri, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"), legend.position = "none",
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black")) +

  #facet_grid(season ~ .) +
  scale_y_continuous(breaks = c(5, 10, 15, 20, 25, 30, 35)) +
  ggtitle("Contribution to peak demand (17:00-21:00) for winter in 2015") +
  labs(x='Circuit', y='Contribution in %') 



#XX_Into Annex??_XX

#XX_eval=FALSE_XX

#The stacks do not add up to 100 as we have averaged the contributions over all households. XXMention in the textXX

myPlot <- ggplot2::ggplot(summaryP1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_blank(),
        axis.text.x = element_text(colour = "black"),
        axis.ticks.y = element_blank())+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Season', y='Contribution in %') 
myPlot
#Now do the same for wattage contribution...

#This section aims to calculate the wattage contribution. We recycle previous results...

ContriW1DT <- dcast(dataBucketDT,#This tis the contribution DT for wattage and the first definition of peak demand
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW")
ContriW1DT<-ContriW1DT[, c( "Total") := NULL]#We do not need this column

ContriW1DT <- setnames(ContriW1DT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))
ContriW1DT <- data.table::as.data.table(gather(ContriW1DT, eecaLabelContri, contribution, `Heat Pump`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

ContriW1DT <- ContriW1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

#Adding percentages

ContriW1DT <- ContriW1DT[year==2015]

ContriW1DT <- ContriW1DT[, totalW := sum(averageContri), keyby =.(season)]
ContriW1DT <- ContriW1DT[, percentage := (averageContri/totalW)*100]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

ConsLoadFactorSupplyDT <- copy(ContriW1DT)

We have then used this data to calculate the mean contribution of each identifiable load to peak demand (time period 17:00-21:00) for each season and year across the sample.

Results for 2015 are shown in Figure \@ref(fig:wattageContributionPeak1Plot) and Figure \@ref(fig:wattageContributionPeak1PlotPercent) (as percentage contribution).

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour ="black"))+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Season', y='Power in W/house') 
myPlot

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= "", y= percentage, fill = eecaLabelContri)) +
  geom_bar(stat = "identity", width = 1, color ="white")+
  #labs(x='', y='percentage')+
   facet_grid(facets=. ~season) +
  coord_polar("y", start = 0)+
  theme_void()+
 theme(#text = element_text(family = "Cambria"),
       legend.position = 'bottom')+
  guides(fill=guide_legend(title="Circuit"))+
  geom_text(aes(y = percentage,
              label = sprintf("%0.0f%%",percentage)),
           position = position_stack(vjust = .5),
           size = 3)

myPlot

Figure \@ref(fig:wattageContributionPeak1Plot) shows the seasonal variation in power demand especially for heat pumps and lighting but also shows the relative constancy of hot water’s contribution across the seasons. As a consequence the percentage contribution of hot water slightly increases in summer whereas the contribution of heat pumps decreases (see Figure \@ref(fig:wattageContributionPeak1PlotPercent)). Hot water is identified as the appliance that is contributing the most to peak demand in this case. It should be noted that these mean values will mask considerable day to day and inter/intra-household variation. Note also that ‘Other’ almost certainly contains some forms of heating as it is much lower in summer (see the coding definitions in Table \@ref(tab:categoriseCircuitsTable)).

Region/network coincident peak

In this section we focus on the contribution of these identifiable loads to regional peak demand. As 22 of the GREENGrid houses are based in Taranaki and 20 houses in Hawke's Bay with further 2 pilot dwellings in Otago, we focus on the first two regions by separating the sample into the Hawke’s Bay and Taranaki sub-samples.

Within the Taranaki sample, 31% of the houses had heat pumps installed, 45% hot water cylinders, and 27% other heat sources. In contrast, the Hawke's Bay sub-sample 90% had heat pumps, 90% hot water cylinder, and 36% other heat sources. The very small sample sizes and recruiting approaches limit the representativeness of the sample.

Footnote: For further details see https://cfsotago.github.io/GREENGridData/householdAttributeProcessingReport_v1.0.html#51_main_heat_source and also ref to Part C

To determine the timing of the matching regional peaks, data on regional demand for the Hawke’s Bay and Taranaki Grid Exit Points (GXPs - see Table \@ref(tab:whichGXPs)) was obtained from the Electricity Authority and for each GXP, the timestamps of the 100 half hours with the highest demand were identified. We then extracted the power demand of each dwelling in each of the two regions during these 100 half hours for analysis.

gxpsToUse <- data.table::fread(paste0(here::here(), "/data/input/gxp-lookup.csv"))
gxpGeo <- data.table::fread(paste0(here::here(), "/data/input/gxpGeolookup.csv")) # if needed?

kableExtra::kable(gxpsToUse[, .(node, `region name`)], caption = "GXPs used")

regpeakDT <- copy(powerDT)

regpeakDT <- regpeakDT[, circuitLabel := NULL]
regpeakDT <- regpeakDT[, peak := NULL]
regpeakDT <- regpeakDT[, month := NULL]
regpeakDT <- regpeakDT[, r_dateTimeHalfHour := NULL]
regpeakDT <- regpeakDT[, sdPowerW := NULL]
regpeakDT <- regpeakDT[, minPowerW := NULL]
regpeakDT <- regpeakDT[, circuitID := NULL]
regpeakDT <- regpeakDT[, circuit := NULL]
regpeakDT <- regpeakDT[, maxPowerW := NULL]
regpeakDT <- regpeakDT[, date := NULL]
regpeakDT <- regpeakDT[, obsHalfHour := NULL]#We do not need these, it slows down our computer...

setkey(hhSurveyDT, linkID)
setkey(regpeakDT, linkID)

# Location = region in the survey data
regpeakDT <- hhSurveyDT[,.(linkID, Location)][regpeakDT] # attach Location to linkID in power data

regpeakDT <- regpeakDT[, region := Location] #Creating new variable

# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_08", "Taranaki", region)]#Match regions with IDs
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_26", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_21", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_13", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_24", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_27", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_19", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_11", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_10", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_22", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_06", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_14", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_23", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_07", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_25", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_12", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_16", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_20", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_09", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_15", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_17", "Taranaki", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_18", "Taranaki", region)]
# 
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_43", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_34", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_30", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_28", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_38", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_37", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_41", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_42", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_36", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_45", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_40", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_32", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_33", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_39", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_44", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_35", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_46", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_29", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_31", "Hawkes Bay", region)]
# regpeakDT <- regpeakDT[, region := ifelse(`linkID`== "rf_47", "Hawkes Bay", region)]


ind <- with(regpeakDT, region == is.na(regpeakDT$region))#Now let's remove the households that do not match the two regions
regpeakDT <- regpeakDT[!ind, ]#Removing households that do not match and have NA as value

taranakiDT <- copy(regpeakDT)
taranakiDT <- taranakiDT[region=="Taranaki"]#taranaki only
hawkesDT <- copy(regpeakDT)
hawkesDT <- hawkesDT[region=="Hawkes Bay"]#hawkes bay only

##Data import
# GXPtaranakiDT <- data.table::as.data.table(
#  readr::read_csv(paste0(here::here(), "/data/output/taranakiGxpTop100DT.csv",
#                        col_types = cols(
#                        region = col_character(),
#                        regionName = col_character(),
#                        rDateTime = col_datetime(format = ""),
#                        weekdays = col_character(),
#                        peakPeriod = col_character(),
#                        sumkWh = col_double(),
#                        nObs = col_double(),
#                        nRegions = col_double(),
#                        nPOCs = col_double(),
#                        nNetworks = col_double(),
#                        nGenTypes = col_double(),
#                        month = col_character() # <- crucial otherwise readr infers integers for some reason
#                      )
#                  )
# )

GXPtaranakiDT <- data.table::fread(paste0(here::here(), "/data/output/taranakiGxpTop100DT.csv"))


# set to NZ time from UTC
GXPtaranakiDT <- GXPtaranakiDT[, r_dateTime_nz := lubridate::as_datetime(rDateTime, 
                                               tz = "Pacific/Auckland")] # this will be UTC unless you set this

#Define Winter/else
GXPtaranakiDT <- GXPtaranakiDT[, date := lubridate::date(r_dateTime_nz)]
GXPtaranakiDT <- GXPtaranakiDT[, obsHalfHour := hms::as.hms(r_dateTime_nz)]
#powerDT[, obsHalfHour := format(ymd_hms(r_dateTimeHalfHour), "%H:%M:%S")]
GXPtaranakiDT <- GXPtaranakiDT[, month := lubridate::month(r_dateTime_nz)]


GXPtaranakiDT <- GXPtaranakiDT[month == 12 | month == 1 | month == 2, season := "Summer"]
GXPtaranakiDT <- GXPtaranakiDT[month == 3 | month == 4 | month == 5, season := "Autumn"]
GXPtaranakiDT <- GXPtaranakiDT[month == 6 | month == 7 | month == 8, season := "Winter"]
GXPtaranakiDT <- GXPtaranakiDT[month == 9 | month == 10 | month == 11, season := "Spring"]

##Data import
# GXPHawkesDT <- data.table::as.data.table(
#  readr::read_csv(paste0(here::here(), "/data/output/hawkesBayGxpTop100DT.csv",
#                        col_types = cols(
#                        region = col_character(),
#                        regionName = col_character(),
#                        rDateTime = col_datetime(format = ""),
#                        weekdays = col_character(),
#                        peakPeriod = col_character(),
#                        sumkWh = col_double(),
#                        nObs = col_double(),
#                        nRegions = col_double(),
#                        nPOCs = col_double(),
#                        nNetworks = col_double(),
#                        nGenTypes = col_double(),
#                        month = col_character() # <- crucial otherwise readr infers integers for some reason
#                      )
#                  )
# )

GXPHawkesDT <- data.table::fread(paste0(here::here(), "/data/output/hawkesBayGxpTop100DT.csv"))
# set to NZ time from UTC
GXPHawkesDT <- GXPHawkesDT[, r_dateTime_nz := lubridate::as_datetime(rDateTime, 
                                               tz = "Pacific/Auckland")] # this will be UTC unless you set this

#Define Winter/else
GXPHawkesDT <- GXPHawkesDT[, date := lubridate::date(r_dateTime_nz)]
GXPHawkesDT <- GXPHawkesDT[, obsHalfHour := hms::as.hms(r_dateTime_nz)]
#powerDT[, obsHalfHour := format(ymd_hms(r_dateTimeHalfHour), "%H:%M:%S")]
GXPHawkesDT <- GXPHawkesDT[, month := lubridate::month(r_dateTime_nz)]


GXPHawkesDT <- GXPHawkesDT[month == 12 | month == 1 | month == 2, season := "Summer"]
GXPHawkesDT <- GXPHawkesDT[month == 3 | month == 4 | month == 5, season := "Autumn"]
GXPHawkesDT <- GXPHawkesDT[month == 6 | month == 7 | month == 8, season := "Winter"]
GXPHawkesDT <- GXPHawkesDT[month == 9 | month == 10 | month == 11, season := "Spring"]

#These loops take one night to process....
#I set eval=FALSE to prevent it from running
#They look at the GXP rDateTime and then find this time in the region DT and extract the whole row into a new DT

#Hawke's Bay
# we have the peak half-hours in GXPHawkesDT
filterDT <- GXPHawkesDT[, .(r_dateTime_nz)]
setkey(filterDT, r_dateTime_nz)
setkey(powerDT, r_dateTime_nz)
extractHawkesDT <- powerDT[filterDT] # this will return just the dateTimes that match the filter simple and very fast :-)

#extractHawkesDT <- NULL
#dateTimelistHawkes <- hawkesDT[["r_dateTime_nz"]]
#dateTimelist <- GXPHawkesDT[["r_dateTime_nz"]]
#i<-0
#for (time in c(dateTimelist)){
#  for (newtime in c(dateTimelistHawkes)){
#    i<-i+1
#    if (toString(time) == toString(newtime)){
#      #test = do.call("rbind",(hawkesDT[i,]))
#    extractHawkesDT = rbind(extractHawkesDT, data.table(hawkesDT[i,]))
#     # print(c(hawkesDT[i,]))
#    }
#  }
#  i <- 0
#}

#Taranaki
# we have the peak half-hours in GXPTaranakiDT
filterDT <- GXPtaranakiDT[, .(r_dateTime_nz)]
setkey(filterDT, r_dateTime_nz)
setkey(powerDT, r_dateTime_nz)
extractTaranakiDT <- powerDT[filterDT]
#extractTaranakiDT <- NULL
#dateTimelistTaranaki <- taranakiDT[["r_dateTime_nz"]]
#dateTimelist <- GXPtaranakiDT[["r_dateTime_nz"]]
#i<-0
#for (time in c(dateTimelist)){
#  for (newtime in c(dateTimelistTaranaki)){
#    i<-i+1
#    if (toString(time) == toString(newtime)){
#      #test = do.call("rbind",(hawkesDT[i,]))
#      extractTaranakiDT = rbind(extractTaranakiDT, data.table(taranakiDT[i,]))
#     # print(c(hawkesDT[i,]))
#    }
#  }
#  i <- 0
#}

#Maybe it is a good idea to export these two tables as a csv in case we want to re-run analysis without waiting for ages.

#write.csv(extractHawkesDT, file = "extractHawkes.csv", row.names = FALSE) #They are stored in the Dropbox data folder

#write.csv(extractTaranakiDT, file = "extractTaranaki.csv", row.names = FALSE)

# no longer needed

#The loops above do not allow us to do doReport() in a reasonable time. I therefore decided to export the result and import it rather than calculating it again in the loops.

# #Loading Hawkes Bay extract data
# extractHawkesDT <- data.table::as.data.table(
#  readr::read_csv(paste0(here::here(), "/data/output/extractHawkes.csv"),
#                        col_types = cols(
#                        linkID = col_character(),
#                        nObs = col_double(),
#                        r_dateTime_nz = col_datetime(format = ""),
#                        meanPowerW = col_double(),
#                        season = col_character(),
#                        year = col_character(),
#                        eecaCircuit = col_character(),
#                        region = col_character() # <- crucial otherwise readr infers integers for some reason
#                      )
#                  )
# )
# 
# #Loading Taranaki extract data
# 
# extractTaranakiDT <- data.table::as.data.table(
#  readr::read_csv(paste0(here::here(), "/data/output/extractTaranaki.csv"),
#                        col_types = cols(
#                        linkID = col_character(),
#                        nObs = col_double(),
#                        r_dateTime_nz = col_datetime(format = ""),
#                        meanPowerW = col_double(),
#                        season = col_character(),
#                        year = col_character(),
#                        eecaCircuit = col_character(),
#                        region = col_character() # <- crucial otherwise readr infers integers for some reason
#                      )
#                  )
# )
# 
#

Hawke's Bay

Table \@ref(tab:RegionalAnalysisHawkesContribution) presents example data of the extracted peak half-hours by circuit for 2015 for a Hawke’s Bay dwellings. The complete data table is saved as “Extracted peak load half-hour circuits for Hawke’s Bay.csv” and is supplied with the report.

resAllSeasonsDT <- copy(extractHawkesDT)

resAllSeasonsDT <- dcast(resAllSeasonsDT,
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW") # take 1 household and turn data round

# Now just calculate the % using the identified circuits?


resAllSeasonsDT <- resAllSeasonsDT[, pc_HW := `Hot water`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_HP := `Heat Pump or Heating`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_LI := `Lighting`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_OV := `Oven`/Total]
#resAllSeasonsDT <- resAllSeasonsDT[, pc_Other := 1-((`Hot water` + `Heat Pump or Heating` + `Lighting` + `Oven`)/Total)]
resAllSeasonsDT <- resAllSeasonsDT[, pc_Other := `Other`/Total]

copiedtable <- copy(resAllSeasonsDT)
copiedtable <- copiedtable[, c( "pc_HW") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_HP") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_LI") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_OV") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_Other") := NULL]#We do not need this column





kableExtra::kable(head(copiedtable[year == "2015"], 10), digits = 2,
                  caption = "Example data: Extracted peak load half-hour circuits for Hawke's Bay (NaN indicates no such circuit)")  %>% # test result
  kable_styling()


WriteDT <- data.table::fwrite(copiedtable, paste0(here::here(), "/data/output/Extracted peak load half-hour circuits for Hawke's Bay.csv"))

#Eval=False, we do not need this 

summaryP1DT <- copy(resAllSeasonsDT) #This is the summary table to present to EECA for the first definition of peak demand

summaryP1DT<-summaryP1DT[, c("Heat Pump or Heating", "Hot water", "Lighting", "Oven", "Total", "Other") := NULL]

#summaryP1DT<-summaryP1DT[, averageContriHW := ]
summaryP1DT <- setnames(summaryP1DT, old=c("pc_HW","pc_HP", "pc_LI", "pc_OV", "pc_Other"),
                        new=c("Hot Water", "Heat Pump", "Lighting", "Oven", "Other"))

#summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaCircuit, meanPowerW, `Heat Pump or Heating`:`Oven`, factor_key = TRUE))
summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaLabelContri, contribution, `Hot Water`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

summaryP1DT <- summaryP1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

summaryP1DT <- summaryP1DT[, averageContri := averageContri*100]#Conversion to % for plot

summaryP1DT$season <- factor(summaryP1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot

#Eval=false, we use the pie instead

#The stacks do not add up to 100 as we have averaged the contributions over all households. XXMention in the textXX XXDue to the fact that all top100 half hours were in winter we only have winter in the plotXX

myPlot <- ggplot2::ggplot(summaryP1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.y = element_blank())+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Season', y='Contribution in %') 
myPlot
#Now do the same for wattage contribution...

#This section aims to calculate the wattage contribution. We recycle previous results...
ContriW1DT <- copy(extractHawkesDT)

ContriW1DT <- dcast(ContriW1DT,#This tis the contribution DT for wattage and the first definition of peak demand
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW")
ContriW1DT<-ContriW1DT[, c( "Total") := NULL]#We do not need this column

ContriW1DT <- setnames(ContriW1DT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))
ContriW1DT <- data.table::as.data.table(gather(ContriW1DT, eecaLabelContri, contribution, `Heat Pump`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

ContriW1DT <- ContriW1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot


#Adding percentages

ContriW1DT <- ContriW1DT[year==2015]

ContriW1DT <- ContriW1DT[, totalW := sum(averageContri), keyby =.(season)]
ContriW1DT <- ContriW1DT[, percentage := (averageContri/totalW)*100]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

The mean contribution of each identifiable load to the Hawke’s Bay regional peak demand (100 half hours with the highest demand at Hawke’s Bay GXP) across the sample is shown in Figure \@ref(fig:wattageContributionHawkesPlot). Hot Water and Heat Pumps contribute a mean of 682 Watts and 672 Watts respectively but ‘Other’ (which will include some heating see Table \@ref(tab:categoriseCircuitsTable)) contributes 792 W. Figure \@ref(fig:wattageContributionHawkesPlotPercentage) shows the percentage contribution of each identifiable load to the regional peak. Heat Pumps and Hot Water contributed almost equally to regional network peak demand in Hawke’s Bay (27% and 28%), followed by Lighting (10%) and Oven (3%). ‘Other’ demand contributed 32%. It should be noted that the inherent bias in the sample, and the very small sample size in each region means that these % results are almost certainly not representative of a larger population.

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_text(colour ="black"))+


    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='', y='Power in W/house') 
myPlot

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= "", y= percentage, fill = eecaLabelContri)) +
  geom_bar(stat = "identity", width = 1, color ="white")+
  #labs(x='', y='percentage')+
   #facet_grid(facets=. ~season) +
  coord_polar("y", start = 0)+
  theme_void()+
 theme(#text = element_text(family = "Cambria"),
       legend.position = 'bottom',
        axis.text.y = element_blank())+
  guides(fill=guide_legend(title="Circuit"))+
  geom_text(aes(y = percentage,
              label = sprintf("%0.0f%%",percentage)),
           position = position_stack(vjust = .5),
           size = 4)

myPlot

Taranaki

Table \@ref(tab:RegionalAnalysisTaranakiContribution) presents example data of the extracted peak half-hours by circuit for 2015 for the Taranaki houses. The complete data table is saved as “Extracted peak load half-hour circuits for Taranaki.csv” and is supplied with the report.

resAllSeasonsDT <- copy(extractTaranakiDT)

resAllSeasonsDT <- dcast(resAllSeasonsDT,
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW") # take 1 household and turn data round

# Now just calculate the % using the identified circuits?


resAllSeasonsDT <- resAllSeasonsDT[, pc_HW := `Hot water`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_HP := `Heat Pump or Heating`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_LI := `Lighting`/Total]
resAllSeasonsDT <- resAllSeasonsDT[, pc_OV := `Oven`/Total]
#resAllSeasonsDT <- resAllSeasonsDT[, pc_Other := 1-((`Hot water` + `Heat Pump or Heating` + `Lighting` + `Oven`)/Total)]
resAllSeasonsDT <- resAllSeasonsDT[, pc_Other := `Other`/Total]


copiedtable <- copy(resAllSeasonsDT)
copiedtable <- copiedtable[, c( "pc_HW") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_HP") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_LI") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_OV") := NULL]#We do not need this column
copiedtable <- copiedtable[, c( "pc_Other") := NULL]#We do not need this column

kableExtra::kable(head(copiedtable[year == "2015"], 10), digits = 2,
                  caption = "Example data: Extracted peak load half-hour circuits (NaN indicates no such circuit)")  %>% # test result
  kable_styling()





WriteDT <- data.table::fwrite(copiedtable, paste0(here::here(), "/data/output/Extracted peak load half-hour circuits for Taranaki.csv"))

#Eval=False, not needed
summaryP1DT <- copy(resAllSeasonsDT) #This is the summary table to present to EECA for the first definition of peak demand

summaryP1DT<-summaryP1DT[, c("Heat Pump or Heating", "Hot water", "Lighting", "Oven", "Total", "Other") := NULL]

#summaryP1DT<-summaryP1DT[, averageContriHW := ]
summaryP1DT <- setnames(summaryP1DT, old=c("pc_HW","pc_HP", "pc_LI", "pc_OV", "pc_Other"),
                        new=c("Hot Water", "Heat Pump", "Lighting", "Oven", "Other"))

#summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaCircuit, meanPowerW, `Heat Pump or Heating`:`Oven`, factor_key = TRUE))
summaryP1DT <- data.table::as.data.table(gather(summaryP1DT, eecaLabelContri, contribution, `Hot Water`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

summaryP1DT <- summaryP1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

summaryP1DT <- summaryP1DT[, averageContri := averageContri*100]#Conversion to % for plot

summaryP1DT$season <- factor(summaryP1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot

#Adding percentages

summaryP1DT <- summaryP1DT[year==2015]

summaryP1DT <- summaryP1DT[, totalW := sum(averageContri), keyby =.(season)]
summaryP1DT <- summaryP1DT[, percentage := (averageContri/totalW)*100]

summaryP1DT$season <- factor(summaryP1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
summaryP1DT$eecaLabelContri <- factor(summaryP1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

#Eval=False, we use the pie chart instead

#The stacks do not add up to 100 as we have averaged the contributions over all households. XXMention in the textXX XXDue to the fact that all top100 half hours were in winter we only have winter in the plotXX

myPlot <- ggplot2::ggplot(summaryP1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_blank(),
        axis.text.x = element_text(colour = "black"),
        axis.ticks.y = element_blank())+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Season', y='Contribution in %') 
myPlot
#Now do the same for wattage contribution...

#This section aims to calculate the wattage contribution. We recycle previous results...
ContriW1DT <- copy(extractTaranakiDT)

ContriW1DT <- dcast(ContriW1DT,#This tis the contribution DT for wattage and the first definition of peak demand
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW")
ContriW1DT<-ContriW1DT[, c( "Total") := NULL]#We do not need this column

ContriW1DT <- setnames(ContriW1DT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))
ContriW1DT <- data.table::as.data.table(gather(ContriW1DT, eecaLabelContri, contribution, `Heat Pump`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

ContriW1DT <- ContriW1DT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year)]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot



#Adding percentages

ContriW1DT <- ContriW1DT[year==2015]

ContriW1DT <- ContriW1DT[, totalW := sum(averageContri), keyby =.(season)]
ContriW1DT <- ContriW1DT[, percentage := (averageContri/totalW)*100]

ContriW1DT$season <- factor(ContriW1DT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot
ContriW1DT$eecaLabelContri <- factor(ContriW1DT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

The mean contribution by identifiable load to the regional peak across the sample is presented in Figure \@ref(fig:wattageContributionTaranakiPlot). Figure \@ref(fig:wattageContributionTaranakiPlotPercentage) shows the percentage contribution to peak demand. Lighting contributes the most to peak demand with 25%, followed by Heat Pumps (23%), Hot Water (14%), and Oven (8%). ‘Other’ loads contribute 30% to peak demand in Taranaki.

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= season, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
    theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_text(colour ="black"))+

    geom_text(aes(y = averageContri,
              label = sprintf("%0.2f",averageContri)),
           position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='', y='Power in W/house') 
myPlot

myPlot <- ggplot2::ggplot(ContriW1DT[year == 2015], aes(x= "", y= percentage, fill = eecaLabelContri)) +
  geom_bar(stat = "identity", width = 1, color ="white")+
  #labs(x='', y='percentage')+
   #facet_grid(facets=. ~season) +
  coord_polar("y", start = 0)+
  theme_void()+
 theme(#text = element_text(family = "Cambria"),
       legend.position = 'bottom')+
  guides(fill=guide_legend(title="Circuit"))+
  geom_text(aes(y = percentage,
              label = sprintf("%0.0f%%",percentage)),
           position = position_stack(vjust = .5),
           size = 4)

myPlot

Sample coincident peak

In this section the sum of half-hourly demand over all dwellings in the GREENGrid sample is used to define peak demand. We define the peak as the 20 half hours with the highest total half-hourly demand summed across all dwellings, which we refer to as the sample coincident peak. This is in contrast to Section 4.1 where peak demand was predefined as the time between 17:00 and 21:00 in winter and section 4.2 where peak demand was predefined as the 100 half hours with the largest demand based on regional GXP data.

Table \@ref(tab:sampleCoincidentPeak) identifies the half hours of highest overall demand in 2015 using the sum of the half-hourly mean power values (sumOfmeanPowerW). As we can see the number of dwellings contributing to this sum is between 31 and 33 in each half hour and the table is sorted in descending order of the mean total power across all contributing dwellings. We control for ‘missing dwellings’ by sorting by and using the mean total power (last column) as our indicator. The complete data table is saved as “Extracted winter peak load half-hours in descending order for 2015” and is supplied with the report.

#sampleDT <- labelEECACircuits(powerDT) # why??

sampleShortDT <- powerDT[, .(sumOfmeanPowerW = sum(meanPowerW, na.rm = TRUE), 
                              nDwellings=.N), keyby=.(eecaCircuit, date, season, year, obsHalfHour)]#This is the aggregated Wattage for each circuit month etc. see keyby in order to conduct sample coincident peak analysis 


sampleShortDT <- sampleShortDT[, averagePowerW := sumOfmeanPowerW/nDwellings]#We have to consider that some of the obs have only one obs, some have up to 200. We need to take the average to identify the top 20 entries (their time of occurance)


sample2015DT <- sampleShortDT[year == 2015] # Getting rid of years that are not necessary for this analysis

#sample2015DT <- sort(sample2015DT$averagePowerW, decreasing = TRUE)

sample2015DT <- setorder(sample2015DT, -averagePowerW)

kableExtra::kable(head(sample2015DT[year == "2015"], 10), digits = 2,
                  caption = "Example data: Extracted winter peak load half-hours in descending order for 2015")  %>% # test result
  kable_styling()


data.table::fwrite(sample2015DT, paste0(here::here(), "/data/output/Extracted winter peak load half-hours in descending order for 2015.csv"))




#Now identufy these top20 times in the original data table and conduct analysis again. We only use this data table to identify the peak half hours. Now we go back to original data files

####Filtering the top20 peak half-hours based on our sample. XXThis is hard-coded - is there a better way to do it??XX

# sample2015DT is in descending order so head(dt, 20) will give us the top 20
filterDT <- head(sample2015DT[, .(date, obsHalfHour, averagePowerW)], 20)

# filteredDT <- powerDT %>% filter(date == "2015-06-28" & obsHalfHour == hms::as.hms("18:00:00") |
#                                    date == "2015-08-21" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-06-23" & obsHalfHour == hms::as.hms("18:00:00") |
#                                    date == "2015-06-24" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-05-29" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-06-17" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-08-12" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-06-23" & obsHalfHour == hms::as.hms("17:30:00") |
#                                    date == "2015-06-26" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-06-25" & obsHalfHour == hms::as.hms("07:30:00") |
#                                    date == "2015-06-23" & obsHalfHour == hms::as.hms("18:30:00") |
#                                    date == "2015-05-25" & obsHalfHour == hms::as.hms("19:00:00") |
#                                    date == "2015-06-23" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-08-10" & obsHalfHour == hms::as.hms("20:00:00") |
#                                    date == "2015-06-26" & obsHalfHour == hms::as.hms("07:30:00") |
#                                    date == "2015-07-10" & obsHalfHour == hms::as.hms("07:30:00") |
#                                    date == "2015-08-28" & obsHalfHour == hms::as.hms("07:00:00") |
#                                    date == "2015-07-09" & obsHalfHour == hms::as.hms("07:30:00") |
#                                    date == "2015-06-23" & obsHalfHour == hms::as.hms("19:30:00") |
#                                    date == "2015-05-27" & obsHalfHour == hms::as.hms("07:00:00"))

setkey(powerDT, date, obsHalfHour)
setkey(filterDT, date, obsHalfHour)
filteredDT <- powerDT[filterDT]
#filteredDT <- labelEECACircuits(filteredDT) # not needed

contriSampleDT <- dcast(filteredDT,
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW") # take 1 household and turn data round

# Now just calculate the % using the identified circuits?


contriSampleDT <- contriSampleDT[, pc_HW := `Hot water`/Total]
contriSampleDT <- contriSampleDT[, pc_HP := `Heat Pump or Heating`/Total]
contriSampleDT <- contriSampleDT[, pc_LI := `Lighting`/Total]
contriSampleDT <- contriSampleDT[, pc_OV := `Oven`/Total]
contriSampleDT <- contriSampleDT[, pc_Other := Other/Total]

#Eval=False, we use the pie chart instead
summaryPSDT <- copy(contriSampleDT) #This is the summary table to present to EECA for the first definition of peak demand

summaryPSDT<-summaryPSDT[, c("Heat Pump or Heating", "Hot water", "Lighting", "Oven", "Total", "Other") := NULL]


summaryPSDT <- setnames(summaryPSDT, old=c("pc_HW","pc_HP", "pc_LI", "pc_OV", "pc_Other"),
                        new=c("Hot Water", "Heat Pump", "Lighting", "Oven", "Other"))

summaryPSDT <- data.table::as.data.table(gather(summaryPSDT, eecaLabelContri, contribution, `Hot Water`:`Oven`, factor_key = TRUE))#This brings our data into a nice form for the plot

summaryPSDT <- summaryPSDT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year, r_dateTime_nz)]

summaryPSDT <- summaryPSDT[, averageContri := averageContri*100 ]#Conversion to % for plot

summaryPSDT$season <- factor(summaryPSDT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot

Figure \@ref(fig:wattageContributionPeakSamplePlotAll) shows the mean contribution of each identifiable load for the sample to the top 20 identified peak half hours cross the sample. The results show that on average Heat Pumps contribute the most, followed by Hot Water and Lighting. The plot also shows that ‘Other’ has a large contribution. Again, these results should be viewed with some caution due to the non-representative nature of the GREEN Grid sample. While they represent the sample co-incident peak contributions, we cannot assume that this is representative of all New Zealand dwellings.

#This section aims to calculate the wattage contribution. We recycle previous results...

ContriWSDT <- dcast(filteredDT,#This tis the contribution DT for wattage and the first definition of peak demand
           linkID + year + season + r_dateTime_nz ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "meanPowerW")
#ContriWSDT<-ContriWSDT[, c( "Total") := NULL]#We do not need this column

ContriWSDT <- setnames(ContriWSDT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other", "Total"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other", "Total"))
ContriWSDT <- data.table::as.data.table(gather(ContriWSDT, eecaLabelContri, contribution, `Heat Pump`:`Total`, factor_key = TRUE))#This brings our data into a nice form for the plot

ContriWSDT <- ContriWSDT[, .(averageContri = mean(contribution, na.rm = TRUE), 
                              nobs=.N, sd=sd(contribution, na.rm = TRUE)), keyby=.(eecaLabelContri, season, year, r_dateTime_nz)]

ContriWSDT$season <- factor(ContriWSDT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot

ContriWSDT$eecaLabelContri <- factor(ContriWSDT$eecaLabelContri, levels = c("Hot Water","Heat Pump", "Lighting", "Oven", "Other", "Total"))#This puts the seasons in the right order in the plot

#Adding percentages

#ContriWSDT <- ContriWSDT[r_dateTime_nz == "2015-06-28 18:00:00"]#Request from EECA: Show us all half hours

ContriWSDT[, totalW := sum(averageContri), keyby =.(season, r_dateTime_nz)]
ContriWSDT[, percentage := (averageContri/totalW)*100]

# ContriWSDT$season <- factor(ContriWSDT$season, levels = c("Spring","Summer",
#                                                     "Autumn", "Winter"))#This puts the seasons in the right order in the plot
# ContriWSDT$eecaLabelContri <- factor(ContriWSDT$eecaLabelContri, levels = c("Hot Water","Heat Pump",
#                                                     "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

ContriWSDT[, dateChar := as.factor(r_dateTime_nz)]
ContriWSDT[, dateChar := format(r_dateTime_nz, format='%Y-%m-%d %H:%M')]

myPlot <- ggplot2::ggplot(ContriWSDT[eecaLabelContri != "Total"], 
                          aes(x= reorder(dateChar, averageContri), fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  coord_flip()+
  #scale_y_datetime(r_dateTime_nz= "%Y")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour="black"),
        axis.ticks.x = element_line(colour="black"),
        axis.text.y = element_text(colour ="black"))+
    #geom_text(aes(y = averageContri,
        #     label = sprintf("%0.2f",averageContri)),
       #     position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_x_continuous(breaks = c(seq(1,20))) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x=' Date times of the top 20 half hours', y='Power in W/house') 
myPlot

myPlot <- ggplot2::ggplot(ContriWSDT[r_dateTime_nz == "2015-06-28 18:00:00"], aes(x= "", y= percentage, fill = eecaLabelContri)) +
  geom_bar(stat = "identity", width = 1, color ="white")+
  #labs(x='', y='percentage')+
   #facet_grid(facets=. ~season) +
  coord_polar("y", start = 0)+
  theme_void()+
 theme(#text = element_text(family = "Cambria"),
       legend.position = 'bottom')+
  guides(fill=guide_legend(title="Circuit"))+
  geom_text(aes(y = percentage,
              label = sprintf("%0.0f%%",percentage)),
           position = position_stack(vjust = .5),
           size = 4)

myPlot

#EVAL=FALSE We use the pie chart instead

#The stacks do not add up to 100 as we have averaged the contributions over all households. XXMention in the textXX

myPlot <- ggplot2::ggplot(summaryPSDT[r_dateTime_nz =="2015-06-28 18:00:00"], aes(x= eecaLabelContri, fill = eecaLabelContri)) +
  geom_bar(aes(y = averageContri), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_blank(),
        axis.text.x = element_text(colour = "black"),
        legend.position = "none",
        axis.ticks.y = element_blank())+

    geom_text(aes(y = averageContri,
             label = sprintf("%0.2f",averageContri)),
            position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Circuit', y='Contribution in %') 
myPlot
#Now do the same for wattage contribution...

Annual Contribution

This section presents annual and seasonal energy consumption (in kWh) for each identifiable load. This analysis considers the year 2015. Figure \@ref(fig:annualBoxPlot) shows the annual total energy consumption (in kWh) for the identifiable loads in a box and whisker plot to show the median, 25%/75% quantiles, whiskers and outliers. Hot Water consumes the most energy per year with a pronounced variation between houses. Lighting, Heat Pump, and Oven follow. “Other” clearly drive large inter-household variation and, as noted above, should be a focus of further work.

# kWh
# annual total per circuit
annualDT <- powerDT[, .(totalAnnualEnergyPerCircuitWh = sum(energyWh, na.rm = TRUE)), 
                              keyby = .(linkID, eecaCircuit, year)] 

# Calculating the contribution to annual energy consumption and turning off e-notation

options(scipen = 999)

annualWideDT <- dcast(annualDT, 
           linkID + year ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "totalAnnualEnergyPerCircuitWh") # take 1 household and turn data round

annualWideDT <- annualWideDT[, pc_HW := `Hot water`/Total]
annualWideDT <- annualWideDT[, pc_HP := `Heat Pump or Heating`/Total]
annualWideDT <- annualWideDT[, pc_LI := `Lighting`/Total]
annualWideDT <- annualWideDT[, pc_OV := `Oven`/Total]
annualWideDT <- annualWideDT[, pc_Other := Other/Total]
# There are a lot of NA/Nan in the data. XXSensible to turn NaN into NA to receive more results??XX

annualWideDT <- setnames(annualWideDT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))

annualWideDT <- annualWideDT[, c( "Total") := NULL]#We do not need this column

#Summarising the variation in demand
#We need to dcast it back into the original form
plotDT <- data.table::as.data.table(gather(annualWideDT, eecaCircuit, totalAnnualEnergyPerCircuitWh, `Heat Pump`:`Oven`, factor_key = TRUE))

plotDT[, totalAnnualEnergyPerCircuitkWh := totalAnnualEnergyPerCircuitWh/1000]

myPlot <- ggplot2::ggplot(plotDT[year == 2015], aes((y=totalAnnualEnergyPerCircuitkWh), x=eecaCircuit)) +
  geom_boxplot() +
  labs(x = "Circuit", y= "Annual Energy consumption (kWh)") +
  #scale_x_continuous( limits = c(0,2000))+
  #facet_grid( season ~ year)+
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))+
  theme(#text = element_text(family = "Cambria")
    ) 
myPlot

# kWh
# seasonal

seasonalDT <- powerDT[, .(totalSeasonalEnergyPerCircuitWh = sum(energyWh, na.rm = TRUE)), 
                         keyby = .(linkID, eecaCircuit, year, season)] #Ignores NAs

# Calculating the contribution to annual energy consumption and turning off e-notation

options(scipen = 999)

seasonalWideDT <- dcast(seasonalDT, 
           linkID + year + season ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           fill = NA, # so we don't get NaN
           value.var = "totalSeasonalEnergyPerCircuitWh") # take 1 household and turn data round

seasonalWideDT <- seasonalWideDT[, pc_HW := `Hot water`/Total]
seasonalWideDT <- seasonalWideDT[, pc_HP := `Heat Pump or Heating`/Total]
seasonalWideDT <- seasonalWideDT[, pc_LI := `Lighting`/Total]
seasonalWideDT <- seasonalWideDT[, pc_OV := `Oven`/Total]
seasonalWideDT <- seasonalWideDT[, pc_Other := Other/Total]
# There are a lot of NA/Nan in the data. XXSensible to turn NaN into NA to receive more results??XX

seasonalWideDT <- setnames(seasonalWideDT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))


seasonalWideDT <- seasonalWideDT[, c( "Total") := NULL]#We do not need this column


#Summarising the variation in demand
#We need to dcast it back into the original form
plotDT <- data.table::as.data.table(gather(seasonalWideDT, eecaCircuit, totalSeasonalEnergyPerCircuitWh, `Heat Pump`:`Oven`, factor_key = TRUE))

plotDT$season <- factor(plotDT$season, levels = c("Spring","Summer",
                                                    "Autumn", "Winter"))#This puts the seasons in the right order in the plot


plotDT$eecaCircuit <- factor(plotDT$eecaCircuit, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plot

Figure \@ref(fig:annualBoxPlot) conceals the significant variation in demand between seasons and in the following we use density plots to show the shape of the distribution for each circuit by season.

Figure \@ref(fig:annualConsPlotHW) shows the distribution of seasonal energy consumption in kWh across all houses for Hot Water in 2015 using a density plot. This shows that most energy consumption for Hot Water is in winter, followed by spring and autumn. Energy consumption for hot water in summer is substantially lower with a distinctively different pattern to both the spring/autumn seasons and winter.

Example histogram saved as .png

myPlot <- ggplot2::ggplot(plotDT[eecaCircuit =="Hot Water"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=season)) +
  geom_density() +
  labs(x = "Seasonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,2000))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))
myPlot

myhisto <- ggplot2::ggplot(plotDT[eecaCircuit =="Hot Water"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, fill=season, alpha = 0.5)) +
  geom_histogram(position = "dodge") +
  labs(x = "Seasonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,2000))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))

# save it
ggsave(paste0(here::here(), "/docs/plots/contribHistoTest.png"), myhisto)

In contrast, Figure \@ref(fig:annualConsPlotHP) presents a similar density plot for Heat Pumps. As expected, most energy consumption is in winter. In summer, total energy consumption is mostly 0 (appliance is turned off). Only a few observations registered energy consumption in summer which may be cooling.

myPlot <- ggplot2::ggplot(plotDT[eecaCircuit =="Heat Pump"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=season)) +
  geom_density() +
  labs(x = "Seasonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,500))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))
myPlot

Energy consumption of Lighting is particularly apparent in spring, autumn, and winter as presented in Figure \@ref(fig:annualConsPlotL). Less energy consumption is identified in summer.

myPlot <- ggplot2::ggplot(plotDT[eecaCircuit =="Lighting"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=season)) +
  geom_density() +
  labs(x = "Seaonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,500))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))
myPlot

For ovens, there is less seasonal variation in seasonal energy consumption as shown in Figure \@ref(fig:annualConsPlotOven). Again, energy consumption in summer is less than in other seasons.

myPlot <- ggplot2::ggplot(plotDT[eecaCircuit =="Oven"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=season)) +
  geom_density() +
  labs(x = "Seasonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,500))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))
myPlot

For "Other", there is a lot of seasonal variation in seasonal energy consumption as shown in Figure \@ref(fig:annualConsPlotOther). As this circuit contains appliances not further specified it is difficult to identify the appliance causing this energy consumption, particularly in summer. Therefore, further research is necessary.

myPlot <- ggplot2::ggplot(plotDT[eecaCircuit =="Other"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=season)) +
  geom_density() +
  labs(x = "Seasonal energy (kWh)") +
  #scale_x_continuous( limits = c(0,500))+
     geom_vline(aes(xintercept=mean(totalSeasonalEnergyPerCircuitWh/1000)),
            color="blue", linetype="dashed", size=0.5)+
  #facet_grid( season ~ year)+

  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"))
myPlot

#For some reason I am not able to chnage the legend title below so I will just rename the variable in a copied data table...


dt2 <- copy(plotDT)

dt2 <- setnames(dt2, old=c("eecaCircuit"),
                        new=c("Circuit"))

myPlot <- ggplot2::ggplot(dt2[year == 2015 & season == "Winter"], aes(x=totalSeasonalEnergyPerCircuitWh/1000, colour=Circuit)) +
  geom_density() +
  labs(x = "Energy consumption in winter (kWh)") +
  #scale_x_continuous( limits = c(0,4000))+
  facet_grid( season ~ year) +
  theme(#text = element_text(family = "Cambria"),
        axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        axis.ticks.y = element_line(colour = "black"),
        axis.ticks.x = element_line(colour = "black"),
        legend.title = element_text(colour = "black")
        )

myPlot

Finally, Figure \@ref(fig:annualConsPlotWinter) shows the variation in all circuits for winter 2015. On average, Hot Water used the most energy, followed by Heat Pumps, Lighting, and Ovens but 'Other' shows a wide variation which again warrants further investigation.

Load Factor

#___A___Calculatig average demand by household, circuit


selectDT <- powerDT[year == 2015]

AverageDemandDT <- selectDT[, .(avgDemandW = mean(meanPowerW, na.rm = TRUE)), keyby =.(linkID, eecaCircuit)]

#___B___Calculating average demand at peak by household, circuit

selectDT <- powerDT[season == "Winter" & 
                    year == 2015 &
                    obsHalfHour >= hms::as.hms("17:00:00") & 
                    obsHalfHour <= hms::as.hms("21:00:00")]


AveragePeakDemandDT <- selectDT[, .(avgPeakDemandW = mean(meanPowerW, na.rm = TRUE)), keyby =.(linkID, eecaCircuit)]

#Merge data tables
setkey(AverageDemandDT, linkID, eecaCircuit)
setkey(AveragePeakDemandDT, linkID, eecaCircuit)


MergedDT <- AveragePeakDemandDT[AverageDemandDT]

#Calculating LF as being peak demand / average demand 

MergedDT <- MergedDT[, LoadFactor := NA]
MergedDT <- MergedDT[, LoadFactor := avgDemandW / avgPeakDemandW]

MergedDT <- MergedDT[, CLF := 1000/(LoadFactor*8760)]


#Calculating CLF as being (LF/1000) / 8760

#MergedDT <- MergedDT[, CLF := NA]
#MergedDT <- MergedDT[, CLF := LoadFactor*1000 / 8760]

#Write CSV


data.table::fwrite(MergedDT, paste0(here::here(), "/data/output/Extracted load factor by household and circuit.csv"))


MergedTable <- MergedDT[linkID == "rf_13" | linkID=="rf_06"]

Merged2DT <- copy(MergedDT)

Merged2DT <- dcast(Merged2DT, 
           linkID ~ eecaCircuit, 
           fun = mean, # won't do anything as we should just have 1 value?
           #fill = NA, # so we don't get NaN
           value.var = "CLF") # take 1 household and turn data round


Merged2DT <- setnames(Merged2DT, old=c("Heat Pump or Heating","Hot water", "Lighting", "Oven", "Other"),
                        new=c("Heat Pump", "Hot Water", "Lighting", "Oven", "Other"))


Merged2DT <- Merged2DT[, c( "Total") := NULL]#We do not need this column


#Summarising the variation in demand
#We need to dcast it back into the original form
dt <- data.table::as.data.table(gather(Merged2DT, eecaCircuit, CLF, `Heat Pump`:`Oven`, factor_key = TRUE))


dt$eecaCircuit <- factor(dt$eecaCircuit, levels = c("Hot Water","Heat Pump",
                                                    "Lighting", "Oven", "Other"))#This puts the seasons in the right order in the plo

dt3 <- dt[, .(meanCLF = mean(CLF, na.rm= TRUE)), keyby =. (eecaCircuit)]

The Load Factor (LF) is defined as the ratio of mean demand (W) and the peak demand (W) for each load. In this section we define peak demand to be the mean demand between 17:00-21:00 in winter. The LF provides a measure of how much the load contributes to peak. If the LF for a particular load is 1 then the load is flat. If the LF is less than 1 this shows that the load disproportionately contributes to the peak. Note that values greater than 1 indicate that demand during the peak is reduced below the mean. An alternative measure is given by the conservation load factor defined by:

`CLF = Peak W/((Average W*8760)/1000)`

The CLF represents the ratio of the power demand at peak to the annual energy consumption over the year (in W/kWh). The advantage of the CLF is that is gives a larger value when the load makes larger contribution to the peak. Table \@ref(tab:clfTableExample) shows the first rows of our calculation of LF and CLF.

kableExtra::kable(head(MergedTable), digits = 2,
                  caption = "Example data: Extracted load factor by household and circuit")  %>% # test result
  kable_styling()

The full data table containing LFs and CLFs (if available) for each house is saved as “Extracted load factor by household and circuit.csv” and supplied with the report.

Figure \@ref(fig:consLoadPlot) shows the Conservation Load Factor (CLF) for each identifiable load averaged over all houses in the sample. Results show that the CLF for Heat Pumps (0.55) is the highest, followed by Oven (0.43), Lighting (0.38), and Hot Water (0.17). 'Other' had a CLF of 0.24.

myPlot <- ggplot2::ggplot(dt3, aes(x= eecaCircuit, fill = eecaCircuit)) +
  geom_bar(aes(y = meanCLF), stat = "identity")+
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_blank(),
        axis.text.x = element_text(colour = "black"),
        legend.position = "none",
        axis.ticks.y = element_blank())+

    geom_text(aes(y = meanCLF,
             label = sprintf("%0.2f",meanCLF)),
            position = position_stack(vjust = .5))+

  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Circuit', y='Conservation Load Factor') 
myPlot

Note that these numbers are means over all households, dates, and times for 2015. This means that each household contributes equally to the mean and it is not weighted by its total demand. For example, the load factor from households with very low hot water demand will contribute equally to the mean.

In addition, it is clear that the mean CLF over all houses is skewed by some houses with CLF > 0.9 as Figure \@ref(fig:clfBox) shows.

myPlot <- ggplot2::ggplot(dt, aes(x= eecaCircuit, y=CLF)) +
  geom_boxplot() +
  theme(#text = element_text(family = "Cambria"),
        axis.text.y = element_text(colour = "black"),
        axis.text.x = element_text(colour = "black"),
        legend.position = "none",
        axis.ticks.y = element_line(colour = "black"))+



  #facet_grid(season ~ .) +
  #scale_y_continuous(breaks = c()) +
  #ggtitle("Contribution to peak demand (17:00-21:00) in 2015") +
  guides(fill=guide_legend(title="Circuit"))+
  labs(x='Circuit', y='Conservation Load Factor') 
myPlot

Summary

This report has used the GREEN Grid sample of circuit-level household electricity demand data to analyse the contribution of heat pumps, hot water, lighting, and ovens to peak demand. Calculations have been carried out using a variety of definitions of “peak”. In addition, we also calculated the annual energy consumption of each identifiable load in the sample and a conservation load factor which is a measure of peak demand to annual energy consumption for each identifiable load.

The results show plausible results, however there are a number of key considerations in extrapolating these results:

Unidentified loads represented a significant contribution in the sample
Means calculated over the sample can significantly distort the results
The non-representative nature of the sample means that the results cannot be considered to be representative of New Zealand households

These short comings suggest that future work should focus on (1) collecting data from a more representative sample of New Zealand households, (2) ensuring that appliances are identified on circuits (especially electric heaters); (3) variation across the sample be accounted for in any results.

Data Annex

Categorised circuits

Table \@ref(tab:categoriseCircuitsTable) shows the categorised circuit labels used in this analysis. It would be relatively easy to adjust these definitions in future work if required.

t <- with(powerDT, table(circuit, eecaCircuit))

kableExtra::kable(t, caption = "Definition of 'eeca circuits' (columns) based on original circuits (rows). Values are the count of half-hours in the complete power data")  %>% # test result
  kable_styling()

Original data description

Descriptive statistics for aggregate half hourly power data for all households and all circuits:

skimr::skim(powerDT)

The following tables show descriptive statistics for the meanPowerW values for each circuit by household.

linkIDs <- unique(powerDT$linkID)
# loop over households
# haven't worked out how to increment the table number

for(hh in linkIDs){
  # prints a lot of badly formatted tables
    t <- powerDT[linkID == hh, 
    .(meanPowerW = mean(meanPowerW, na.rm = TRUE),
      minPowerW = min(meanPowerW, na.rm = TRUE),
      maxPowerW = max(meanPowerW, na.rm = TRUE),
      nObs = .N), keyby = .(circuit, eecaCircuit)]
    print(kableExtra::kable(t, 
                            digits = 2,
                            caption = paste0(hh, ": Mean of half-hourly mean power (W) by original circuit label & EECA circuit type"))) 
    cat("\n")
}

Runtime

t <- proc.time() - startTime
elapsed <- t[[3]]

Analysis completed in r round(elapsed,2) seconds ( r round(elapsed/60,2) minutes) using knitr in RStudio with r R.version.string running on r R.version$platform.

R environment

R packages used

base R [@baseR]
bookdown [@bookdown]
data.table [@data.table]
ggplot2 [@ggplot2]
kableExtra [@kableExtra]
knitr [@knitr]
lubridate [@lubridate]
rmarkdown [@rmarkdown]

Session info

sessionInfo()

References

CfSOtago/GREENGridEECA documentation built on May 1, 2022, 8:08 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

CfSOtago/GREENGridEECA
Supports Analysis of NZ GREEN Grid data for EECA

In CfSOtago/GREENGridEECA: Supports Analysis of NZ GREEN Grid data for EECA

About

Citation

Report circulation:

License

History

Support

Introduction

Data

Peak Contribution

Winter (June - August) evenings 17:00 – 21:00

Maximum load per season (W)

Region/network coincident peak

Hawke's Bay

Taranaki

Sample coincident peak

Annual Contribution

Load Factor

Summary

Data Annex

Categorised circuits

Original data description

Runtime

R environment

R packages used

Session info

References

R Package Documentation

Browse R Packages

We want your feedback!

CfSOtago/GREENGridEECA Supports Analysis of NZ GREEN Grid data for EECA

In CfSOtago/GREENGridEECA: Supports Analysis of NZ GREEN Grid data for EECA

About

Citation

Report circulation:

License

History

Support

Introduction

Data

Peak Contribution

Winter (June - August) evenings 17:00 – 21:00

Maximum load per season (W)

Region/network coincident peak

Hawke's Bay

Taranaki

Sample coincident peak

Annual Contribution

Load Factor

Summary

Data Annex

Categorised circuits

Original data description

Runtime

R environment

R packages used

Session info

References

R Package Documentation

Browse R Packages

We want your feedback!

CfSOtago/GREENGridEECA
Supports Analysis of NZ GREEN Grid data for EECA