bassPCA: Bayesian Adaptive Spline Surfaces (BASS) with PCA...
In BASS: Bayesian Adaptive Spline Surfaces

bassPCA

R Documentation

Bayesian Adaptive Spline Surfaces (BASS) with PCA decomposition of response

Description

Decomposes a multivariate or functional response onto a principal component basis and fits a BASS model to each basis coefficient.

Usage

bassPCA(
  xx = NULL,
  y = NULL,
  dat = NULL,
  n.pc = NULL,
  perc.var = 99,
  n.cores = 1,
  parType = "fork",
  center = T,
  scale = F,
  ...
)

Arguments

`xx`	a data frame or matrix of predictors with dimension n x p. Categorical predictors should be included as factors.
`y`	a response matrix (functional response) with dimension n x m.
`dat`	optional (for more control) list with elements `xx` (same as above), `y` (same as above), `n.pc` (number of principal components used), `basis` (principal components with dimension m x `n.pc`), `newy` (reduced dimension `y` with dimension `n.pc` x n), `trunc.error` (optional truncation error with dimension n x m), `y.m` (vector mean removed before PCA with dimension m), `y.s` (vector sd scaled before PCA with dimension m). If `dat` is specified, `xx`, `y` and `n.pc` do not need to be specified.
`n.pc`	number of principal components to use
`perc.var`	optionally specify percent of variance to explain instead of n.pc
`n.cores`	integer number of cores (threads) to use
`parType`	either "fork" or "socket". Forking is typically faster, but not compatible with Windows. If `n.cores==1`, `parType` is ignored.
`center`	logical whether to subtract the mean before getting the principal components, or else a numeric vector of dimension m for the center to be used
`scale`	logical whether to divide by the standard deviation before getting the principal components, or else a numeric vector of dimension m for the scale to be used
`...`	arguements to be passed to `bass` function calls.

Details

Gets the PCA decomposition of the response y, and fits a bass model to each PCA basis coefficient, bass(dat$xx,dat$newy[i,],...) for i in 1 to n.pc, possibly in parallel.

Value

An object of class 'bassBasis' with two elements:

`mod.list`	list (of length `n.pc`) of individual bass models
`dat`	same as dat above

Examples

## Not run: 
  ## simulate data (Friedman function)
f<-function(x){
  10*sin(pi*x[,1]*x[,2])+20*(x[,3]-.5)^2+10*x[,4]+5*x[,5]
}
## simulate data (Friedman function with first variable as functional)
sigma<-.1 # noise sd
n<-500 # number of observations
nfunc<-50 # size of functional variable grid
xfunc<-seq(0,1,length.out=nfunc) # functional grid
x<-matrix(runif(n*9),n,9) # 9 non-functional variables, only first 4 matter
X<-cbind(rep(xfunc,each=n),kronecker(rep(1,nfunc),x)) # to get y
y<-matrix(f(X),nrow=n)+rnorm(n*nfunc,0,sigma)

## fit BASS
library(parallel)
mod<-bassPCA(x,y,n.pc=5,n.cores=min(5,parallel::detectCores()))
plot(mod$mod.list[[1]])
plot(mod$mod.list[[2]])
plot(mod$mod.list[[3]])
plot(mod$mod.list[[4]])
plot(mod$mod.list[[5]])

hist(mod$dat$trunc.error)

## prediction
npred<-100
xpred<-matrix(runif(npred*9),npred,9)
Xpred<-cbind(rep(xfunc,each=npred),kronecker(rep(1,nfunc),xpred))
ypred<-matrix(f(Xpred),nrow=npred)
pred<-predict(mod,xpred,mcmc.use=1:1000) # posterior predictive samples (each is a curve)
matplot(ypred,apply(pred,2:3,mean),type='l',xlab='observed',ylab='mean prediction')
abline(a=0,b=1,col=2)
matplot(t(ypred),type='l') # actual
matplot(t(apply(pred,2:3,mean)),type='l') # mean prediction

## sensitivity
sens<-sobolBasis(mod,int.order = 2,ncores = max(parallel::detectCores()-2,1),
                 mcmc.use=1000) # for speed, only use a few samples
plot(sens)


## calibration
x.true<-runif(9,0,1) # what we are trying to learn
yobs<-f(cbind(xfunc,kronecker(rep(1,nfunc),t(x.true)))) +
  rnorm(nfunc,0,.1) # calibration data (with measurement error)
plot(yobs)

cal<-calibrate.bassBasis(y=yobs,mod=mod,
                         discrep.mean=rep(0,nfunc),
                         discrep.mat=diag(nfunc)[,1:2]*.0000001,
                         sd.est=.1,
                         s2.df=50,
                         s2.ind=rep(1,nfunc),
                         meas.error.cor=diag(nfunc),
                         bounds=rbind(rep(0,9),rep(1,9)),
                         nmcmc=10000,
                         temperature.ladder=1.05^(0:30),type=1)

nburn<-5000
uu<-seq(nburn,10000,5)

pairs(rbind(cal$theta[uu,1,],x.true),col=c(rep(1,length(uu)),2),ylim=c(0,1),xlim=c(0,1))

pred<-apply(predict(mod,cal$theta[uu,1,],nugget = T,trunc.error = T,
      mcmc.use = cal$ii[uu]),3,function(x) diag(x)+rnorm(length(uu),0,sqrt(cal$s2[uu,1,1])))
qq<-apply(pred,2,quantile,probs=c(.025,.975))
matplot(t(qq),col='lightgrey',type='l')
lines(yobs,lwd=3)



## End(Not run)
## minimal example for CRAN testing
mod<-bassPCA(1:10,matrix(1:20,10),n.pc=2,nmcmc=2,nburn=1)

BASS documentation built on July 9, 2023, 6:57 p.m.