crossvalidation: Loss Calculation by Cross Validation

Description Usage Arguments Details Value See Also Examples

Description

The functions here calculate the loss by cross validation for the Bayesian hierarchical model (see also Hier) and the Bayesian model with Ising prior (see also Ising). They can be used to select the best hyperparameters and to compare the two models.

Usage

Lossfun(aedata, PI)

kfdpar(adsl, adae, k)

CVhier(AElist, n_burn, n_iter, thin, n_adapt, n_chain, alpha.gamma = 3,
  beta.gamma = 1, alpha.theta = 3, beta.theta = 1,
  mu.gamma.0.0 = 0, tau.gamma.0.0 = 0.1, alpha.gamma.0.0 = 3,
  beta.gamma.0.0 = 1, lambda.alpha = 0.1, lambda.beta = 0.1,
  mu.theta.0.0 = 0, tau.theta.0.0 = 0.1, alpha.theta.0.0 = 3,
  beta.theta.0.0 = 1)

CVising(AElist, n_burn, n_iter, thin, alpha_ = 0.25, beta_ = 0.75,
  alpha.t = 0.25, beta.t = 0.75, alpha.c = 0.25, beta.c = 0.75,
  rho, theta)

Arguments

aedata

output from function preprocess

PI

output from function Hiergetpi or Isinggetpi

k

integer, the number of folds used to split the dataset for cross validation

n_burn

integer, number of burn-in iterations for Gibbs sampling

n_iter

integer, number of iterations for Gibbs sampling

thin

thinning interval for Gibbs sampling; parameters are recorded at every thin-th iteration

n_adapt

integer, number of adaptations

n_chain

number of MCMC chains

alpha_

numeric, alpha parameter of the beta prior distribution shared by the treatment and control groups

beta_

numeric, beta parameter of the beta prior distribution shared by the treatment and control groups

alpha.t

numeric, alpha parameter of the beta prior distribution for the treatment group

beta.t

numeric, beta parameter of the beta prior distribution for the treatment group

alpha.c

numeric, alpha parameter of the beta prior distribution for the control group

beta.c

numeric, beta parameter of the beta prior distribution for the control group

rho

either a single number or a numeric vector with length equal to the number of rows of the data frame aedata. If it is a single number, all adverse events use the same hyperparameter rho. If it is a numeric vector, each AE has its own hyperparameter rho, and the order of the rho values should match the order of the AEs in aedata (AEs in aedata should be ordered by b and j).

theta

numeric; rho and theta are hyperparameters of the Ising prior

Details

The loss is calculated by:

√{∑_{bj} [(Y_{bj}-N_t*t_{bj})^2]}/N_t + √{∑_{bj} [(X_{bj}-N_c*c_{bj})^2]}/N_c

Here b=1, ..., B and j=1, ..., k_b; Y_{bj} and X_{bj} are the numbers of subjects with an AE with PT j under SOC b in the treatment and control groups. N_t and N_c are the numbers of subjects in the treatment and control groups, respectively. t_{bj} and c_{bj} are the model-fitted incidences of an AE with PT j under SOC b in the treatment and control groups. This formula gives the loss for one iteration/sample; the final loss is the average of the losses over all iterations/samples.
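The per-sample loss above can be sketched as a small R function. This is a minimal illustration, not the package's Lossfun: the argument names (Y, X, t_bj, c_bj, Nt, Nc) are hypothetical stand-ins for the quantities in the formula.

```r
# Sketch of the per-sample loss; each argument is a vector over the (b, j)
# adverse events, except Nt and Nc which are the group sizes.
loss_sketch <- function(Y, X, t_bj, c_bj, Nt, Nc) {
  # Y, X: observed AE counts per (b, j) in treatment and control
  # t_bj, c_bj: model-fitted incidences for the same AEs
  sqrt(sum((Y - Nt * t_bj)^2)) / Nt + sqrt(sum((X - Nc * c_bj)^2)) / Nc
}
```

For example, with Nt = Nc = 100, an AE observed in 4 treatment subjects but fitted at incidence 0.02 contributes |4 - 2| = 2 inside the treatment square root.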

The loss is calculated in the following way: first, the subjects' original AE dataset (output of preprocess) is randomly and evenly divided into k independent subparts. For each subpart, that subpart is used as the testing dataset and the rest of the whole dataset as the training dataset. The model is trained on the training dataset, and the loss is then calculated for both the testing and training datasets. This is repeated for each subpart, and the averages of the testing and training losses over the subparts are taken as the final loss.

Lossfun takes the AE dataset and the fitted incidences as parameters and calculates the loss based on the loss function above.

kfdpar first calls function preprocess to process the data and produce a temporary dataset, and also calls function preprocess to get the whole AE dataset. The temporary dataset is then randomly divided into k equal subparts. For each subpart, that subpart is used as the testing dataset and the rest of the whole dataset as the training dataset. This function generates a list with k elements, where each element is itself a list containing two elements named traindf and testdf: traindf is used to train the model and testdf is used to calculate the loss. The output is then used in cross validation to calculate the loss.
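The split that kfdpar performs can be sketched as follows. This is a hypothetical illustration on a plain data frame; the real kfdpar works on the preprocessed AE dataset and returns the same traindf/testdf structure.

```r
# Sketch of a k-fold split: assign each row a fold label, then for each fold
# return that fold as testdf and the remaining rows as traindf.
kfold_sketch <- function(df, k) {
  fold <- sample(rep(seq_len(k), length.out = nrow(df)))
  lapply(seq_len(k), function(i) {
    list(traindf = df[fold != i, , drop = FALSE],
         testdf  = df[fold == i, , drop = FALSE])
  })
}
```

Each element of the result is a list with traindf and testdf, mirroring how the output of kfdpar is consumed by CVhier and CVising.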

CVhier calculates the loss for the Bayesian hierarchical model.

CVising calculates the loss for the Bayesian model with Ising prior.

Value

Lossfun returns the loss for dataset aedata based on the fitted incidence PI.
kfdpar returns a list with k elements, where each element is itself a list containing two elements named traindf and testdf.
CVhier returns the final training and testing loss for the Bayesian hierarchical model.
CVising returns the final training and testing loss for the Bayesian model with Ising prior.

See Also

preprocess, Hier, Ising, Isinggetpi, Hiergetpi

Examples

## Not run: 
data(ADAE)
data(ADSL)
AEdata<-preprocess(adsl=ADSL, adae=ADAE)
AELIST<-kfdpar(ADSL, ADAE, k=5)

# Bayesian Hierarchical Model
HIERRAW<-Hier_history(aedata=AEdata, n_burn=1000, n_iter=1000, thin=20, n_adapt=1000, n_chain=2)
HIERPI<-Hiergetpi(aedata=AEdata, hierraw=HIERRAW)
loss_1<-Lossfun(aedata=AEdata, PI=HIERPI)
LOSSHIER<-CVhier(AElist=AELIST, n_burn=1000, n_iter=1000, thin=20, n_adapt=1000, n_chain=2)
LOSSHIER$trainloss # train loss
LOSSHIER$testloss # test loss

# Bayesian model with Ising prior
ISINGRAW<-Ising_history(aedata = AEdata, n_burn=1000, n_iter=5000, thin=20, alpha_=0.5, beta_=0.75, alpha.t=0.5, beta.t=0.75,
                                   alpha.c=0.25, beta.c=0.75, rho=1, theta=0.02)
ISINGPI<-Isinggetpi(aedata = AEdata, isingraw=ISINGRAW)
loss_2<-Lossfun(aedata=AEdata, PI=ISINGPI)

LOSSISING<-CVising(AElist=AELIST, n_burn=100, n_iter=500, thin=20, alpha_=0.5, beta_=0.75, alpha.t=0.5, beta.t=0.75,
                             alpha.c=0.25, beta.c=0.75, rho=1, theta=0.02)
LOSSISING$trainloss # train loss
LOSSISING$testloss # test loss

## End(Not run)

ganluan123/FlagAE documentation built on Nov. 4, 2019, 1:02 p.m.