knitr::opts_chunk$set(echo = TRUE)
require(flowBasedClustering)

flowBasedClustering allows to cluster typical flow-based days and to visualize their domains.

1 - Summary

# Define a calendar (period on which the clustering will be made + distinction of each season)
dates <- getSequence("2015-11-01", "2017-01-20")
interSeasonBegin <- c("2016-03-01", "2016-10-01")
interSeasonEnd <- c("2016-05-15", "2016-10-31")
calendar <- getCalendar(dates, interSeasonBegin, interSeasonEnd)

# Identify PTDF in a csv file
ptdf_file_path <- system.file("dataset/ptdf_example.csv", package = "flowBasedClustering")
# (note : this small dataset does not span on all the calendar defined above)

# Transform PTDF data to vertices
vertices <- ptdfToVertices(PTDF = ptdf_file_path, nbCore = 4)

# Cluster the typical days
clusterTD <- clusteringTypicalDays(calendar, vertices, nbClustWeek = 3, 
                      nbClustWeekend = 1, report = TRUE)

# Get probabilities and quantiles from climate file in order to classify the typical days
climate <- fread(system.file("dataset/climate_example.txt",package = "flowBasedClustering"))
probMatrix <- getProbability(climate, cluster = clusterTD, levelsProba = c(1/3, 2*3))

2 - Calendar definition

First, the function getSequence() can be used to build a vector of consecutive dates.

getSequence("2015-11-01", "2015-11-05")

On this period, the different seasons (winter, summer and interseason) can be identified using the getCalendar() function. It uses the main following parameters :

The function returns a list with vectors of dates for week-end and working days for both interseason, winter and summer.

dates <- getSequence("2017-01-01", "2017-12-31")
interSeasonBegin <- c("2017-03-01", "2017-09-01")
interSeasonEnd <- c("2017-05-01", "2017-11-01")
cal <- getCalendar(dates, interSeasonBegin, interSeasonEnd)
str(cal)

3 - PTDF to vertices

PTDF are the equations of the hyperplanes defining the limits of the domain, while the vertices are the coordinnates of the extreme points of the domain.

The function ptdfToVertices() transforms a PTDF file to vertices data. This computation can take some time, and so the function allows to do it on several threads.

The PTDF file must contains at least these seven columns Date, Period, BE, DE, FR, NL, RAM in this order :

ptdf_data <- data.table::fread(system.file("dataset/ptdf_example.csv",package = "flowBasedClustering"),
                   data.table = F)
head(ptdf_data[, 1:7])

The Period lies between 1 and 24 for the 24 hourly market periods. Therefore, period i corresponds to the time-step [i-1 ; i]. And the result is a data.table which contains vertices.

vertices <- data.table::fread(system.file("dataset/vertices_example.txt",package = "flowBasedClustering"))
head(vertices)

4 - Clustering of typical days

Once we have the vertices and the calendar, the clustering algorithm can be run with the function clusteringTypicalDays(). The number of clusters of the different day categories (week days and week-end) can be set with nbClustWeek and nbClustWeekend. A report can be automaticcaly built using report and reportPath arguments.

clusterTD <- clusteringTypicalDays(calendar, vertices, nbClustWeek = 3, nbClustWeekend = 1,
                              report = TRUE)

In case the clustering algorithm must be run on short periods of time or with more or less seasons or types of day. The function clusterTypicalDaysForOneClass() can be used instead of clusteringTypicalDays().

clusterTD <- clusterTypicalDaysForOneClass(vertices = vertices, dates = getSequence("2018-01-01", "2018-01-31"), nbCluster = 2,className = "january2018")

One can also use the output of clusteringTypicalDays() or clusterTypicalDaysForOneClass() to generate report later for one or more typical day(s) using generateClusteringReport(), or just plot domains with clusterPlot() :

clusterTD <- readRDS(system.file("dataset/cluster_example.RDS",package = "flowBasedClustering"))
# build report for one typical day
generateClusteringReport(dayType = 7, data = clusterTD)
Drawing
Drawing

With clusterPlot(), one can choose 2 countries (country1 & country2), the hour and the dayType (i.e. typical day identifer). In options, it can be decided whether to plot only the typical day (typicalDayOnly) or the full cluster and whether to use static or dynamic visualization (ggplot)

# static graphic
clusterPlot(clusterTD, country1 = "FR", country2 = "DE", 
            hour = 8, dayType = 9, typicalDayOnly = FALSE, ggplot = TRUE)
# dynamic graphic
clusterPlot(clusterTD, country1 = "FR", country2 = "DE", 
            hour = 8, dayType = 9, typicalDayOnly = TRUE, ggplot = FALSE)

5 - Probabilities and quantiles

Finally, the function getProbability() can be used to learn the correlations between the occurences of the different clusters and external data (typically climatic data).

Below, an example of a climate file :

climate <- data.table::fread(system.file("dataset/climate_example.txt",package = "flowBasedClustering"))
head(climate)

The quantiles are set with levelsProba, and can be :

# same quantiles for each variables 
MatProb <- getProbability(climate, clusterTD, levelsProba = c(1/3, 2/3))

# diffents quantiles for each class and variables 
levelsProba <- list(summerWd = list(FR_load = c(0.5), DE_wind = c(1/3, 2/3), DE_solar = .5),
                    winterWD = list(FR_load = c(0.5, 0.7), DE_wind = c(.5)))
MatProb <- getProbability(climate, clusterTD, levelsProba = levelsProba)

6 - User guide

Drawing


rte-antares-rpackage/flowBasedClustering documentation built on Nov. 21, 2020, 11:21 a.m.