Conducts period 1 analysis; selects the optimal set of variables that minimizes a k-fold CV error measure and establishes a machine learning model that predicts power output of REF and CTR-b turbines by using period 1 data.
analyze.p1(train, test, ratedPW)
train
A list containing k datasets that will be used to train the machine learning model.
test
A list containing k datasets that will be used to test the machine learning model and calculate CV error measures.
ratedPW
A kW value that describes the (common) rated power of the selected turbines (REF and CTR-b).
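To illustrate the shape of the train and test arguments, here is a minimal base R sketch that builds two lists of k datasets from a toy data frame. The toy data, its column names, and the random fold assignment are all assumptions for illustration. In the package, these lists come from arrange.data (as data$train and data$test), and how it actually forms the folds is not described here.

```r
# Hypothetical toy data; the column names are illustrative only.
set.seed(1)
toy <- data.frame(wind.spd = runif(20, 3, 15), power = runif(20, 0, 1000))

k <- 2
# Assign each row to one of k folds at random (an assumption, not
# necessarily how arrange.data splits the data).
fold.id <- sample(rep(seq_len(k), length.out = nrow(toy)))

# Fold i is held out for testing; the remaining rows are used for training.
test  <- lapply(seq_len(k), function(i) toy[fold.id == i, ])
train <- lapply(seq_len(k), function(i) toy[fold.id != i, ])
```

Each list has k elements, and the ith training set and ith test set together cover every row of the toy data.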
The function returns a list containing period 1 analysis results as follows.
opt.cov
A character vector presenting the names of predictor variables chosen for the optimal set.
pred.REF
A list of k datasets each representing the kth fold's period 1 prediction for the REF turbine.
pred.CTR
A list of k datasets each representing the kth fold's period 1 prediction for the CTR-b turbine.
err.REF
A data frame containing k-fold CV based RMSE values and BIAS values for the REF turbine model, with one row per fold (k rows in total). The first column holds the RMSE values and the second column holds the BIAS values.
err.CTR
A data frame containing k-fold CV based RMSE values and BIAS values for the CTR-b turbine model, structured in the same way as err.REF.
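As a sketch of how the per-fold error measures in err.REF or err.CTR might be summarized, the snippet below averages RMSE and BIAS over the folds. The data frame is a mock stand-in with made-up values, and the column names RMSE and BIAS are assumptions. Only the column order (RMSE first, BIAS second) is documented, so the columns are accessed by position.

```r
# Mock stand-in for err.REF with k = 3 folds; the values are made up.
err.REF <- data.frame(RMSE = c(52.1, 48.7, 50.3),
                      BIAS = c(-1.2, 0.8, 0.1))

# Access columns by position, since only the column order is documented.
mean.rmse <- mean(err.REF[[1]])
mean.bias <- mean(err.REF[[2]])

# Fold-to-fold spread of RMSE, as a rough stability check.
sd.rmse <- sd(err.REF[[1]])
```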
biasCurve.REF
A k by m matrix describing the binned BIAS curve for the REF turbine model, where m is the number of power bins. Technically speaking, the binned values are ‘residuals’, which are the negative of BIAS.
biasCurve.CTR
A k by m matrix describing the binned BIAS curve for the CTR-b turbine model.
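One way to read a k by m bias-curve matrix is to average it over the folds so that each power bin has a single value. The sketch below does this on a mock matrix with made-up values. The NA entry, standing for a bin with no observations in one fold, is an assumption about how empty bins might appear.

```r
# Mock k-by-m matrix (k = 2 folds, m = 4 power bins); the values are made up.
biasCurve.REF <- matrix(c(1.0, -0.5, 0.2, NA,
                          0.6, -0.1, 0.4, 0.3),
                        nrow = 2, byrow = TRUE)

# Average over folds within each bin, skipping missing entries.
avg.curve <- colMeans(biasCurve.REF, na.rm = TRUE)
```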
VERY IMPORTANT!
Selecting the optimal set of variables takes a significant amount of time. For example, with a dataset of typical annual size, evaluating one set of variables on a single fold may take about 20-40 minutes (in the authors' experience).
To help understand the progress of the selection, some informative messages will be displayed while this function runs.
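For a rough sense of total run time, the per-evaluation figure above can be scaled by the number of folds and the number of variable sets evaluated. This is back-of-the-envelope arithmetic only: n.sets is a hypothetical count, since the number of candidate sets the selection procedure actually evaluates is not stated here, and the sketch assumes each set is evaluated once per fold.

```r
# All inputs are hypothetical except the 20-40 minute range quoted above.
n.sets <- 5                               # candidate variable sets (hypothetical)
k <- 2                                    # number of folds
mins.per.eval <- c(low = 20, high = 40)   # minutes per set per fold

# Total time in hours, assuming one evaluation per set per fold.
total.hours <- n.sets * k * mins.per.eval / 60
```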
H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
  power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
  power = y))
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3

data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
  p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
  k.fold = 2)

p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
p1.res$opt.cov #This provides the optimal set of variables.