Home

/

CRAN

/

Rgbp

/

gbp: Fitting Gaussian, Poisson, and Binomial Hierarchical Models

gbp: Fitting Gaussian, Poisson, and Binomial Hierarchical Models
In Rgbp: Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

Description Usage Arguments Details Value Author(s) References Examples

View source: R/gbp.R View source: R/gbp.R

gbp fits Bayesian hierarchical models using the Uniform distribution on the second level variance component (variance of the prior distribution), which enables good frequentist repeated sampling properties.

1
2
3

gbp(y, se.or.n, covariates, mean.PriorDist, model, intercept, 
           confidence.lvl, n.AR, n.AR.factor, trial.scale, save.result,
           normal.CI, t, u)

`y`	a (k by 1) vector of k groups' sample means for Gaussian or of each group's number of successful trials for Binomial and Poisson data, where k is the number of groups (or units) in a dataset.
`se.or.n`	a (k by 1) vector composed of the standard errors of all groups for Gaussian or of each group's total number of trials for Binomial and Poisson data.
`covariates`	(optional) a matrix of covariate(s) each column of which corresponds to one covariate.
`mean.PriorDist`	(optional) a numeric value for the second-level mean parameter, i.e. the mean of prior distribution, if you know this value a priori.
`model`	a character string indicating which hierarchical model to fit. "gaussian" for Gaussian data, "poisson" for Poisson, and "binomial" for Binomial. Default is "gaussian"
`intercept`	`TRUE` or `FALSE` flag indicating whether an intercept should be included in the regression. Default is `TRUE`.
`confidence.lvl`	a `float` between 0 and 1 to estimate 100*confidence.lvl% intervals. Default is 0.95.
`n.AR`	Only for Binomial model. If `n.AR = 1000`, all the results will be based on 1000 posterior samples using the acceptance-rejection sampling. Default is 0.
`n.AR.factor`	Only for Binomial model. If `n.AR = 1000` and `n.AR.factor = 4`, then the acceptance-rejection method will sample 4 * 1000 trial samples, and accept or reject them until the method gets n.AR posterior samples. Default is 4.
`trial.scale`	A scale used in the trial distribution of α. If resultant weight has too mamy 0's, scale should be smaller than before. If resultant weight has too few 0's, scale should be larger than before. If there are relatively huge weights, scale should be larger than before. Default is 1.3.
`save.result`	Only for Binomial model with the acceptance-rejection sampling. If `save.result` is TRUE, all the results of weights and posterior samples will be saved in the gbp object. Default is TRUE.
`normal.CI`	Only applicable for Gaussian data. If `TRUE` then a Normal approximation is used to construct intervals for the level 1 group means. If `FALSE` (default) then a Skew-Normal distribution is used. Setting the value to `TRUE` may result in speed improvements but may lead to intervals that under cover.
`t`	Non-negative constant to determine the hyper-prior distribution of r for the Binomial model with the acceptance-rejection method. If t is positive, then the hyper-prior distribution of r is proper, otherwise improper. dr/(t + r)^{u+1}.
`u`	A positive constant to determine the hyper-prior distribution of r for the Binomial model with the acceptance-rejection method. dr/(t + r)^{u+1}.

gbp fits a hierarchical model whose first-level hierarchy is a distribution of observed data and second-level is a conjugate prior distribution on the first-level parameter. To be specific, for Normal data, gbp constructs a two-level Normal-Normal multilevel model. V_j (=σ^2/n_j) is assumed to be known or to be accurately estimated, and subscript j indicates j-th group (or unit) in a dataset.

(y_j | θ_j) ~ indep N(θ_j, σ_j^2)

(θ_j | μ0_j, A) ~ indep N(μ0_j, A)

μ0_j = x_j'β

for j = 1, …, k, where k is the number of groups (units) in a dataset.

For Poisson data, gbp builds a two-level Poisson-Gamma multilevel model. A square bracket below indicates [mean, variance] of distribution, a constant multiplied to the notation representing Gamma distribution (Gam) is a scale, and y_j = z_j / n_j.

(z_j | θ_j) ~ indep Pois(n_jθ_j)

(θ_j | r, μ0_j) ~ indep Gam(rμ0_j) / r ~ indep Gam[μ0_j, μ0_j / r]

log(μ0_j) = x_j'β

for j = 1, …, k, where k is the number of groups (units) in a dataset.

For Binomial data, gbp sets a two-level Binomial-Beta multilevel model. A square bracket below indicates [mean, variance] of distribution and y_j = z_j / n_j.

(z_j | θ_j) ~ indep Bin(n_j, θ_j)

(θ_j | r, μ0_j) ~ indep Beta(rμ0_j, r(1 - μ0_j)) ~ indep Beta[μ0_j, μ0_j(1 - μ0_j) / (r + 1)]

logit(μ0_j) = x_j'β

for j = 1, …, k, where k is the number of groups (units) in a dataset.

For reference, based on the above notations, the Uniform prior distribution on the second level variance component (variance of the prior distribution) is dA for Gaussian and d(1 / r) (= dr / r^2) for Binomial and Poisson data. The second level variance component can be interpreted as variation among the first-level parameters (θ_j) or variance of ensemble information.

Under this setting, the argument y in gbp is a (k by 1) vector of k groups' sample means (y_j's in the description of Normal-Normal model above) for Gaussian or of each group's number of successful trials (z_j's) for Binomial and Poisson data, where k is the number of groups (or units) in a dataset.

The argument se.or.n is a (k by 1) vector composed of the standard errors (V_j's) of all groups for Gaussian or of each group's total number of trials (n_j's) for Binomial and Poisson data.

As for two optional arguments, covariates and mean.PriorDist, there are three feasible combinations of them to run gbp. The first situation is when we do not have any covariate and do not know a mean of the prior distribution (μ0) a priori. In this case, assigning none of two optional arguments, such as "gbp(z, n, model = "binomial")", will lead to a correct model. gbp will automatically fit a regression with only an intercept term to estimate a common mean of the prior distribution (exchangeability).

The second situation is when we have covariate(s) and do not know a mean of the prior distribution (μ0) a priori. In this case, assigning a matrix, X, each column of which corresponds to one covariate, such as "gbp(z, n, X, model = "poisson")", will lead to a correct model. Default of gbp is to fit a regression including an intercept term to estimate a mean of the prior distribution. Double exchangeability will hold in this case.

The last case is when we know a mean of the prior distribution (μ0) a priori. Now, we do not need to estimate regression coefficients at all because we know a true value of μ0 (strong assumption). Designating this value into the argument of gbp like "gbp(y, se, mean.PriorDist = 3)" is enough to account for it. For reference, mean.PriorDist has a stronger priority than covariates, which means that when both arguments are designated, gbp will fit a hierarchical model using the known mean of prior distribution, mean.PriorDist.

gbp returns an object of class "gbp" which provides three relevant functions plot, print, and summary.

An object of class gbp comprises of:

`sample.mean`	sample mean of each group (or unit)
`se`	if Gaussian data, standard error of sample mean in each group (or unit)
`n`	if Binomial and Poisson data, total number of trials of each group (or unit)
`prior.mean`	numeric if entered, NA if not entered
`prior.mean.hat`	estimate of prior mean by a regression if prior mean is not assigned a priori
`shrinkage`	shrinkage estimate of each group (adjusted posterior mean)
`sd.shrinkage`	posterior standard deviation of shrinkage
`post.mean`	posterior mean of each group
`post.sd`	posterior standard deviation of each group
`post.intv.low`	lower bound of 100*confidence.lvl% posterior interval (quantile of posterior distribution)
`post.intv.upp`	upper bound of 100*confidence.lvl% posterior interval (quantile of posterior distribution)
`model`	"gaussian" for Gaussian, "poisson" for Poisson, and "binomial" for Binomial data
`X`	a covariate vector or matrix if designated. NA if not
`beta.new`	regression coefficient estimates
`beta.var`	estimated variance matrix of regression coefficient
`intercept`	whether TRUE or FALSE
`a.new`	a posterior mode of α defined as log(A) for Gaussian or log(1 / r) for Binomial and Poisson data. Practical meaning (variation of ensemble information) of estimating α will appear in `summary(gbp.object)`.
`a.var`	posterior variance of α
`confidence.lvl`	confidence level based on which confidence interval is constructed
`weight`	(only for Binomial model) weights for acceptance-rejection method
`p`	(only for Binomial model) posterior samples of p based on the acceptance-rejection method, if this method was used. This is a (k by nsim) matrix. Each row contains posteiror samples of each random effect.
`alpha`	(only for Binomial model) posterior samples of alpha based on the acceptance-rejection method, if this method was used
`beta`	(only for Binomial model) posterior samples of beta based on the acceptance-rejection method, if this method was used
`accept.rate`	(only for Binomial model) the acceptance rate of the acceptance-rejection method, if this method was used
`n.AR`	(Only for Binomial model) If `n.AR = 1000`, all the results will be based on 1000 posterior samples using the acceptance-rejection sampling. Default is 0.
`n.AR.factor`	(only for Binomial model) If `n.AR = 1000` and `n.AR.factor = 4`, then the acceptance-rejection method will sample 4 * 1000 trial posteior samples, and accept or reject them. Default is 4.

Hyungsuk Tak, Joseph Kelly, and Carl Morris

Tak, H., Kelly, J., and Morris, C. (2017) Rgbp: An R Package for Gaussian, Poisson, and Binomial Random Effects Models with Frequency Coverage Evaluations. Journal of Statistical Software. 78, 5, 1–33.

Morris, C. and Lysy, M. (2012). Shrinkage Estimation in Multilevel Normal Models. Statistical Science. 27, 1, 115–134.

  # Loading datasets
  data(schools)
  y <- schools$y
  se <- schools$se

  # Arbitrary covariate for schools data
  x2 <- rep(c(-1, 0, 1, 2), 2)
  
  # baseball data where z is Hits and n is AtBats
  z <- c(18, 17, 16, 15, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 10,  9,  8,  7)
  n <- c(45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45)

  # One covariate: 1 if a player is an outfielder and 0 otherwise
  x1 <- c(1,  1,  1,  1,  1,  0,  0,  0,  0,  1,  0,  0,  0,  1,  1,  0,  0,  0)

  ################################################################
  # Gaussian Regression Interactive Multilevel Modeling (GRIMM) #
  ################################################################

    ###################################################################################
    # If we do not have any covariate and do not know a mean of the prior distribution#
    ###################################################################################

    g <- gbp(y, se, model = "gaussian")
    g
    print(g, sort = FALSE)
    summary(g)
    plot(g)
    plot(g, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    gcv <- coverage(g, nsim = 10)  

    ### gcv$coverageRB, gcv$coverage10, gcv$average.coverageRB, gcv$average.coverage10,
    ### gcv$minimum.coverageRB, gcv$minimum.coverage10, gcv$raw.resultRB, 
    ### gcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of A
    ### and a regression coefficient (intercept), 
    ### not using estimated values as true ones.
    gcv <- coverage(g, A.or.r = 9, reg.coef = 10, nsim = 10)  

    ##################################################################################
    # If we have one covariate and do not know a mean of the prior distribution yet, #
    ##################################################################################

    g <- gbp(y, se, x2, model = "gaussian")
    g
    print(g, sort = FALSE)
    summary(g)
    plot(g)
    plot(g, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    gcv <- coverage(g, nsim = 10)  
 
    ### gcv$coverageRB, gcv$coverage10, gcv$average.coverageRB, gcv$average.coverage10,
    ### gcv$minimum.coverageRB, gcv$minimum.coverage10, gcv$raw.resultRB, 
    ### gcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of A
    ### and regression coefficients, not using estimated values 
    ### as true ones. Two values of reg.coef are for beta0 and beta1
    gcv <- coverage(g, A.or.r = 9, reg.coef = c(10, 1), nsim = 10)  

    ################################################
    # If we know a mean of the prior distribution, #
    ################################################

    g <- gbp(y, se, mean.PriorDist = 8, model = "gaussian")
    g
    print(g, sort = FALSE)
    summary(g)
    plot(g)
    plot(g, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    gcv <- coverage(g, nsim = 10)  

    ### gcv$coverageRB, gcv$coverage10, gcv$average.coverageRB, gcv$average.coverage10,
    ### gcv$minimum.coverageRB, gcv$minimum.coverage10, gcv$raw.resultRB, 
    ### gcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of A and
    ### 2nd level mean as true ones, not using estimated values as true ones.
    coverage(g, A.or.r = 9, mean.PriorDist = 5, nsim = 10)  

  ###############################################################
  # Binomial Regression Interactive Multilevel Modeling (BRIMM) #
  ###############################################################

    ###################################################################################
    # If we do not have any covariate and do not know a mean of the prior distribution#
    ###################################################################################

    b <- gbp(z, n, model = "binomial")
    b
    print(b, sort = FALSE)
    summary(b)
    plot(b)
    plot(b, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    bcv <- coverage(b, nsim = 10)  

    ### bcv$coverageRB, bcv$coverage10, bcv$average.coverageRB, bcv$average.coverage10,
    ### bcv$minimum.coverageRB, bcv$minimum.coverage10, bcv$raw.resultRB, 
    ### bcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of r
    ### and a regression coefficient (intercept), 
    ### not using estimated values as true ones.
    bcv <- coverage(b, A.or.r = 60, reg.coef = -1, nsim = 10)  

    ##################################################################################
    # If we have one covariate and do not know a mean of the prior distribution yet, #
    ##################################################################################

    b <- gbp(z, n, x1, model = "binomial")
    b
    print(b, sort = FALSE)
    summary(b)
    plot(b)
    plot(b, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    bcv <- coverage(b, nsim = 10)  

    ### bcv$coverageRB, bcv$coverage10, bcv$average.coverageRB, bcv$average.coverage10,
    ### bcv$minimum.coverageRB, bcv$minimum.coverage10, bcv$raw.resultRB, 
    ### bcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of r
    ### and regression coefficients, not using estimated values 
    ### as true ones. Two values of reg.coef are for beta0 and beta1
    bcv <- coverage(b, A.or.r = 60, reg.coef = c(-1, 0), nsim = 10)  

    ################################################
    # If we know a mean of the prior distribution, #
    ################################################

    b <- gbp(z, n, mean.PriorDist = 0.265, model = "binomial")
    b
    print(b, sort = FALSE)
    summary(b)
    plot(b)
    plot(b, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    bcv <- coverage(b, nsim = 10)  

    ### bcv$coverageRB, bcv$coverage10, bcv$average.coverageRB, bcv$average.coverage10,
    ### bcv$minimum.coverageRB, bcv$minimum.coverage10, bcv$raw.resultRB, 
    ### bcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of r and
    ### 2nd level mean as true ones, not using estimated values as true ones.
    bcv <- coverage(b, A.or.r = 60, mean.PriorDist = 0.3, nsim = 10)  

  ##############################################################
  # Poisson Regression Interactive Multilevel Modeling (PRIMM) #
  ##############################################################

    ################################################
    # If we know a mean of the prior distribution, #
    ################################################

    p <- gbp(z, n, mean.PriorDist = 0.265, model = "poisson")
    p
    print(p, sort = FALSE)
    summary(p)
    plot(p)
    plot(p, sort = FALSE)

    ### when we want to simulate pseudo datasets considering the estimated values 
    ### as true ones.
    pcv <- coverage(p, nsim = 10)  

    ### pcv$coverageRB, pcv$coverage10, pcv$average.coverageRB, pcv$average.coverage10,
    ### pcv$minimum.coverageRB, pcv$minimum.coverage10, pcv$raw.resultRB, 
    ### pcv$raw.result10.

    ### when we want to simulate pseudo datasets based on different values of r and
    ### 2nd level mean as true ones, not using estimated values as true ones.
    pcv <- coverage(p, A.or.r = 60, mean.PriorDist = 0.3, nsim = 10)

Rgbp documentation built on Dec. 18, 2019, 1:41 a.m.

Rgbp index

Package overview

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Rgbp
Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

gbp: Fitting Gaussian, Poisson, and Binomial Hierarchical Models
In Rgbp: Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to gbp in Rgbp...

R Package Documentation

Browse R Packages

We want your feedback!

Rgbp Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

gbp: Fitting Gaussian, Poisson, and Binomial Hierarchical Models In Rgbp: Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to gbp in Rgbp...

R Package Documentation

Browse R Packages

We want your feedback!

Rgbp
Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data

gbp: Fitting Gaussian, Poisson, and Binomial Hierarchical Models
In Rgbp: Hierarchical Modeling and Frequency Method Checking on Overdispersed Gaussian, Poisson, and Binomial Data