conformal.fun.split | R Documentation |
Compute prediction intervals using split conformal inference.
conformal.fun.split(
x,
t,
y,
x0,
train.fun,
predict.fun,
alpha = 0.1,
split = NULL,
seed = FALSE,
randomized = FALSE,
seed_tau = FALSE,
verbose = FALSE,
training_size = 0.5,
s_type = "st-dev"
)
x |
The input variable, a list of n elements. Each element is a list of p vectors (of variable length, since the evaluation grid may change). If x is NULL, the function will sample it from a Gaussian distribution. |
t |
The grid points at which the function y is evaluated, given as a list of vectors. If y is of type "fData" or "mfData", t must be NULL. |
y |
The response variable. It is either, as with x and t, a list of lists of vectors, or a functional data object (of type fd, fData, or mfData). |
x0 |
The new points to evaluate, a list of n0 elements. Each element is a list of p vectors (of variable length). |
train.fun |
A function to perform model training, i.e., to produce an estimator of E(Y|X), the conditional expectation of the response variable Y given features X. Its input arguments should be x: list of features, and y: list of responses. |
predict.fun |
A function to perform prediction for the (mean of the) responses at new feature values. Its input arguments should be out: output produced by train.fun, and newx: feature values at which we want to make predictions. |
alpha |
Miscoverage level for the prediction intervals, i.e., intervals with coverage 1-alpha are formed. Default for alpha is 0.1. |
split |
Indices that define the data-split to be used (i.e., the indices define the first half of the data-split, on which the model is trained). Default is NULL, in which case the split is chosen randomly. |
seed |
Integer to be passed to set.seed before defining the random data-split to be used. Default is FALSE, which effectively sets no seed. If both split and seed are passed, the former takes priority and the latter is ignored. |
randomized |
Should the randomized approach be used? Default is FALSE. |
seed_tau |
The seed for the randomized version. Default is FALSE. |
verbose |
Should intermediate progress be printed out? Default is FALSE. |
training_size |
Split proportion between training and calibration set. Default is 0.5. |
s_type |
The type of modulation function. Three options are currently available: "identity", "st-dev", "alpha-max". Default is "st-dev". |
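The contract described for train.fun and predict.fun can be illustrated with a small base-R pair. This is only a sketch: pointwise_mean is a hypothetical helper (not part of the package), it assumes that all response vectors of a given component share the same length, and it follows the argument names documented above (x, y, out, newx). The arguments the package actually passes to these functions may differ, so check the package source before plugging it in.

```r
# Hypothetical helper, not part of the package: a pointwise-mean estimator
# that follows the documented train.fun / predict.fun argument names.
pointwise_mean <- function() {
  train.fun <- function(x, y) {
    p <- length(y[[1]])
    # For each of the p components, average the n response vectors pointwise.
    # Assumes every observation is evaluated on the same grid.
    m <- lapply(seq_len(p), function(j) {
      Reduce(`+`, lapply(y, function(obs) obs[[j]])) / length(y)
    })
    list(m = m)
  }
  predict.fun <- function(out, newx) {
    # The mean ignores the features: return the fitted mean once per new point.
    lapply(seq_along(newx), function(i) out$m)
  }
  list(train.fun = train.fun, predict.fun = predict.fun)
}

# Toy data: n = 3 observations, p = 1 component, 4 grid points each.
y_toy <- list(list(c(1, 2, 3, 4)), list(c(3, 4, 5, 6)), list(c(5, 6, 7, 8)))
fun_toy <- pointwise_mean()
fit_toy <- fun_toy$train.fun(x = NULL, y = y_toy)
pred_toy <- fun_toy$predict.fun(fit_toy, newx = list(list(0), list(1)))
```

Here pred_toy is a list with one entry per element of newx, each holding the same list-of-p-vectors structure as a single response.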
A list with the following components: t, pred, k_s, s_type, s, alpha, randomized, tau, extremes_are_included, average_width, product_integral. t and s are lists of vectors; pred has the same internal structure as y, but its outer list has length n0; k_s, average_width and product_integral are positive floats; alpha and tau are positive floats less than 1; randomized and extremes_are_included are logical values; s_type is a string.
The function structure is taken from "Conformal Prediction Bands for Multivariate Functional Data" by Diquigiovanni, Fontana, Vantini (2021) and from "The Importance of Being a Band: Finite-Sample Exact Distribution-Free Prediction Sets for Functional Data" by Diquigiovanni, Fontana, Vantini (2021).
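To make the roles of alpha, training_size and the modulation function concrete, the following base-R sketch walks through a generic split conformal band for univariate curves on a common grid. It is an illustration under stated assumptions, not the package's implementation: the toy data, the variable names, the unnormalized pointwise standard deviation used as modulation function, and the non-randomized quantile rule are all choices made here for the example.

```r
set.seed(1)
n <- 40; P <- 50; alpha <- 0.1; training_size <- 0.5
grid <- seq(0, 1, length.out = P)

# Toy curves (assumed data): sine centerline, random vertical shift, small noise.
# Y is an n x P matrix, one curve per row.
Y <- t(sapply(seq_len(n), function(i) {
  sin(2 * pi * grid) + rnorm(1, sd = 0.3) + rnorm(P, sd = 0.1)
}))

# 1. Split the data into a training part and a calibration part.
train_idx <- sample(n, size = floor(training_size * n))
calib_idx <- setdiff(seq_len(n), train_idx)
l <- length(calib_idx)

# 2. Fit the point predictor and the modulation function on the training part.
mu <- colMeans(Y[train_idx, , drop = FALSE])
s <- apply(Y[train_idx, , drop = FALSE], 2, sd)  # st-dev-style, left unnormalized

# 3. Nonconformity score of each calibration curve: modulated sup-norm residual.
scores <- apply(Y[calib_idx, , drop = FALSE], 1, function(y) max(abs(y - mu) / s))

# 4. Band radius: the ceiling((l + 1) * (1 - alpha))-th smallest score.
#    This index must not exceed l, i.e. alpha must be at least 1 / (l + 1).
k_s <- sort(scores)[ceiling((l + 1) * (1 - alpha))]

# 5. Prediction band for a new curve.
lower <- mu - k_s * s
upper <- mu + k_s * s
```

With s set to a constant vector the band has the same width everywhere, which corresponds to the "identity" choice; a data-driven s lets the band widen where the training curves vary more.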
### fData ####################################
N = 20
P = 1e2
grid = seq( 0, 1, length.out = P )
C = roahd::exp_cov_function( grid, alpha = 0.3, beta = 0.4 )
values = roahd::generate_gauss_fdata( N,
centerline = sin( 2 * pi * grid ),
Cov = C )
fD = roahd::fData( grid, values )
x0=list(as.list(grid))
fun=mean_lists()
final.fData = conformal.fun.split(NULL,NULL, fD, x0, fun$train.fun, fun$predict.fun,
alpha=0.1,
split=NULL, seed=FALSE, randomized=FALSE,seed_tau=FALSE,
verbose=TRUE, training_size=0.5,s_type="alpha-max")
plot_fun(final.fData)
### mfData ###################################
N = 1e2
P = 1e3
t0 = 0
t1 = 1
grid = seq( t0, t1, length.out = P )
C = roahd::exp_cov_function( grid, alpha = 0.3, beta = 0.4 )
Data_1 = roahd::generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )
Data_2 = roahd::generate_gauss_fdata( N, centerline = log(1+ 2 * pi * grid ), Cov = C )
mfD=roahd::mfData( grid, list( Data_1, Data_2 ) )
x0=list(as.list(grid))
fun=mean_lists()
final.mfData = conformal.fun.split(NULL,NULL, mfD, x0, fun$train.fun, fun$predict.fun,
alpha=0.1,
split=NULL, seed=FALSE, randomized=FALSE,seed_tau=FALSE,
verbose=TRUE, training_size=0.5,s_type="alpha-max")
h=plot_fun(final.mfData)
### fd ###########################################
daybasis <- fda::create.fourier.basis(c(0, 365), nbasis=65)
tempfd <- fda::smooth.basis(fda::day.5, fda::CanadianWeather$dailyAv[,,"Temperature.C"],daybasis)$fd
Lbasis <- fda::create.constant.basis(c(0, 365))
Lcoef <- matrix(c(0,(2*pi/365)^2,0),1,3)
bfdobj <- fda::fd(Lcoef,Lbasis)
bwtlist <- fda::fd2list(bfdobj)
harmaccelLfd <- fda::Lfd(3, bwtlist)
Ltempmat <- fda::eval.fd(fda::day.5, tempfd, harmaccelLfd)
t=1:365
x0=list(as.list(grid))
fun=mean_lists()
final.fd = conformal.fun.split(NULL,fda::day.5, tempfd, x0, fun$train.fun, fun$predict.fun,
alpha=0.1,
split=NULL, seed=FALSE, randomized=FALSE,seed_tau=FALSE,
verbose=TRUE, training_size=0.5,s_type="alpha-max")
plot_fun(final.fd)