analyze.gain: Analyze Potential Gain from Passive Device Installation on...


Description

Implements the gain analysis as a whole; this includes data arrangement, period 1 analysis, period 2 analysis, and gain quantification.

Usage

analyze.gain(df1, df2, df3, p1.beg, p1.end, p2.beg, p2.end, ratedPW, AEP,
  pw.freq, freq.id = 3, time.format = "%Y-%m-%d %H:%M:%S",
  k.fold = 5, col.time = 1, col.turb = 2, bootstrap = NULL,
  free.sec = NULL, neg.power = FALSE)

Arguments

df1

A dataframe for reference turbine data. This dataframe must include five columns: timestamp, turbine id, wind direction, power output, and air density.

df2

A dataframe for baseline control turbine data. This dataframe must include four columns: timestamp, turbine id, wind speed, and power output.

df3

A dataframe for neutral control turbine data. This dataframe must include four columns and have the same structure as df2.

p1.beg

A string specifying the beginning date of period 1. By default, the value needs to be specified in %Y-%m-%d format, for example, '2014-10-24'. A user can use a different format as long as it is consistent with the format defined in time.format below.

p1.end

A string specifying the end date of period 1. For example, if the value is '2015-10-24', data observed until '2015-10-23 23:50:00' would be considered for period 1.

p2.beg

A string specifying the beginning date of period 2.

p2.end

A string specifying the end date of period 2. Defined in the same way as p1.end.
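The end-date convention described for p1.end and p2.end amounts to a half-open interval: the beginning date is included and the end date is excluded. The following is a minimal base R sketch of that convention, not the package's internal code; the timestamps and the `UTC` time zone are assumptions made for illustration.

```r
# Hypothetical 10-minute timestamps around the end of period 1
time.format <- "%Y-%m-%d %H:%M:%S"
stamps <- c("2015-10-23 23:40:00", "2015-10-23 23:50:00",
            "2015-10-24 00:00:00", "2015-10-24 00:10:00")
ts.parsed <- as.POSIXct(stamps, format = time.format, tz = "UTC")

p1.beg <- as.POSIXct("2014-10-24", format = "%Y-%m-%d", tz = "UTC")
p1.end <- as.POSIXct("2015-10-24", format = "%Y-%m-%d", tz = "UTC")

# Half-open interval [p1.beg, p1.end): observations on the end date are excluded
in.p1 <- ts.parsed >= p1.beg & ts.parsed < p1.end
stamps[in.p1]
```

Under this reading, setting p2.beg equal to p1.end leaves no observation shared between the two periods.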

ratedPW

A kW value that describes the (common) rated power of the selected turbines (REF and CTR-b).

AEP

A kWh value describing the annual energy production from a single turbine.

pw.freq

A matrix or a dataframe that includes power output bins and corresponding frequency in terms of the accumulated hours during an annual period.

freq.id

An integer indicating the column number of pw.freq that describes the frequency of power bins in terms of the accumulated hours during an annual period. By default, this parameter is set to 3.
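As a rough illustration of how pw.freq and freq.id relate, the sketch below builds a small frequency table and selects the hours column by its index. The column names, the 100 kW bin width, and the hour values are all made up for this example and are not taken from the package data.

```r
# Hypothetical frequency table: 100 kW bins up to a 1000 kW rated power
bin.beg <- seq(0, 900, by = 100)
bin.end <- bin.beg + 100
hours   <- c(1500, 1300, 1100, 950, 850, 750, 700, 600, 510, 500)  # made-up values

pw.freq.demo <- data.frame(bin.beg = bin.beg, bin.end = bin.end, hours = hours)

freq.id <- 3                          # column holding the accumulated hours
total.hours <- sum(pw.freq.demo[, freq.id])
total.hours                           # accumulated hours over the annual period
```

Because the frequencies are accumulated hours over one year, their total in this example was chosen to equal 8760.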

time.format

A string describing the format of time stamps used in the data to be analyzed. The default value is '%Y-%m-%d %H:%M:%S'.

k.fold

An integer defining the number of data folds for the period 1 analysis and prediction. In the period 1 analysis, k-fold cross validation (CV) will be applied to choose the optimal set of covariates that results in the least prediction error. The value of k.fold corresponds to the k of the k-fold CV. The default value is 5.
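The role of k.fold can be illustrated with a generic k-fold split in base R. This is only a sketch of the general technique; how the package actually assigns observations to folds (for example, randomly or in contiguous blocks) is not stated here, and the random assignment below is an assumption.

```r
set.seed(1)                      # reproducible fold assignment
n <- 20                          # hypothetical number of period 1 observations
k.fold <- 5

# Randomly assign each observation to one of k folds of near-equal size
fold.id <- sample(rep(seq_len(k.fold), length.out = n))

for (k in seq_len(k.fold)) {
  test.idx  <- which(fold.id == k)   # held-out fold
  train.idx <- which(fold.id != k)   # remaining k - 1 folds
  # A model would be fit on train.idx and its prediction error measured on test.idx
}
table(fold.id)
```

Averaging the prediction error over the k held-out folds gives the score used to compare candidate covariate sets.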

col.time

An integer specifying the column number of time stamps in wind turbine datasets. The default value is 1.

col.turb

An integer specifying the column number of turbine IDs in wind turbine datasets. The default value is 2.

bootstrap

An integer indicating the current replication (run) number of bootstrap. If set to NULL (the default), bootstrap is not applied. Setting this value directly is not recommended; use bootstrap.gain to run bootstrap instead.

free.sec

A list of vectors defining free sectors. Each vector in the list has two scalars: one for the starting direction and another for the ending direction, ordered clockwise. For example, a vector of c(310, 50) is a valid component of the list. By default, this is set to NULL.
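Because sectors are ordered clockwise, a vector such as c(310, 50) wraps through north (0/360 degrees). The sketch below shows one way to test wind directions against such sectors; `in.sector` is a hypothetical helper written for this illustration, and treating both sector boundaries as inclusive is an assumption.

```r
# TRUE where a wind direction (degrees) lies in the sector running clockwise
# from sec[1] to sec[2]; in.sector is a hypothetical helper, not a package function
in.sector <- function(dir, sec) {
  dir <- dir %% 360
  if (sec[1] <= sec[2]) {
    dir >= sec[1] & dir <= sec[2]
  } else {
    # Sector wraps through north (0/360 degrees)
    dir >= sec[1] | dir <= sec[2]
  }
}

free.sec <- list(c(310, 50), c(150, 260))
wind.dir <- c(0, 45, 100, 200, 300, 320)   # made-up directions

# A direction is kept if it falls in any of the listed sectors
keep <- Reduce(`|`, lapply(free.sec, function(s) in.sector(wind.dir, s)))
keep
```

Here 100 and 300 degrees fall outside both sectors, so only the other four directions would be retained.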

neg.power

Either TRUE or FALSE. If TRUE, data points with a negative power output are used in the analysis; if FALSE, they are eliminated. The default value is FALSE.
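The effect of neg.power = FALSE can be pictured as a simple row filter. This is a sketch on made-up data, not the package's own preprocessing code; keeping rows with exactly zero power is an assumption based on the wording that only negative values are eliminated.

```r
# Hypothetical power measurements (kW), including two negative readings
df <- data.frame(time = 1:5, power = c(120, -3, 0, 450, -10))
neg.power <- FALSE

# Drop rows with negative power unless neg.power is TRUE
if (!neg.power) df <- df[df$power >= 0, ]
df
```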

Details

Builds machine learning models for a REF turbine (device installed) and a baseline CTR turbine (CTR-b; without device installation and preferably the one closest to the REF turbine), using data measurements from a neutral CTR turbine (CTR-n; without device installation). Gain is quantified by evaluating the predictions from these models and their differences over two time periods: period 1 (before device installation on the REF turbine) and period 2 (device installed on the REF turbine).

Value

The function returns a list of several objects (lists) that includes all the analysis results from all steps.

data

A list of arranged datasets including period 1 and period 2 data as well as k-folded training and test datasets generated from the period 1 data. See also arrange.data.

p1.res

A list containing period 1 analysis results. This includes the optimal set of predictor variables, period 1 prediction for the REF turbine and CTR-b turbine, the corresponding error measures such as RMSE and BIAS, and BIAS curves for both REF and CTR-b turbine models; see analyze.p1 for the details.

p2.res

A list containing period 2 analysis results. This includes period 2 prediction for the REF turbine and CTR-b turbine. See also analyze.p2.

gain.res

A list containing gain quantification results. This includes effect curve, offset curve, and gain curve as well as the measures of effect (gain without offset), offset, and (the final) gain; see quantify.gain for the details.

Note

References

H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.

See Also

arrange.data, analyze.p1, analyze.p2, quantify.gain

Examples

df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
 power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
 power = y))
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3

# For Full Sector Analysis
res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
 p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
 ratedPW = 1000, AEP = 300000, pw.freq = pw.freq, k.fold = 2)
# In practice, one may use annual data for each of period 1 and period 2 analysis.
# One may typically use k.fold = 5 or 10.

# For Free Sector Analysis
free.sec <- list(c(310, 50), c(150, 260))

res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
 p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
 ratedPW = 1000, AEP = 300000, pw.freq = pw.freq, k.fold = 2,
 free.sec = free.sec)

gain.res <- res$gain.res
gain.res$gain    # This will provide the final gain value.

gainML documentation built on June 28, 2019, 5:05 p.m.