R/gower.R
In goweragreement: Bayesian Gower Agreement for Categorical Data

Documented in confint.gower discrete.dist gower.agree hpd influence.gower L1.dist summary.gower

is.wholenumber = function(x, tol = .Machine$double.eps^0.5)
{
    abs(x - round(x)) < tol
}

#' Compute a highest posterior density (HPD) interval.
#'
#' @details This function uses a given posterior sample to compute an HPD interval at a given significance level.
#'
#' @param x the posterior sample.
#' @param alpha the desired significance level.
#'
#' @return A 2-vector containing the lower endpoint and the upper endpoint, respectively.
#'
#' @export

hpd = function(x, alpha = 0.05)
{
    n = length(x)
    m = round(n * alpha)
    x = sort(x)
    y = x[(n - m + 1):n] - x[1:m]
    z = min(y)
    k = which(y == z)[1]
    c(x[k], x[n - m + k])
}

#' Apply the discrete metric to two scores.
#'
#' @details This function applies the discrete metric to two scores. This is an appropriate distance function for nominal data.
#'
#' @param x a score.
#' @param y a score.
#'
#' @return 0 if \code{x} is equal to \code{y}, 1 if \code{x} is not equal to \code{y}, \code{NA} if either score is \code{NA}.
#'
#' @seealso \code{\link{L1.dist}}
#'
#' @export

discrete.dist = function(x, y)
{
    d = as.numeric(x != y)
    d
}

#' Apply the L1 distance function to two scores.
#'
#' @details This function applies the L1 distance function to two scores. This is an appropriate distance function for ordinal data.
#'
#' @param x a score.
#' @param y a score.
#' @param range the range of the scores, i.e., the difference between the maximum score and the minimum score.
#'
#' @return \eqn{|x - y| / r}, where \eqn{r} is the range of the scores; or \code{NA} if either score is \code{NA}.
#'
#' @seealso \code{\link{discrete.dist}}
#'
#' @export

L1.dist = function(x, y, range)
{
    d = as.numeric(abs(x - y) / range)
    d
}

# The following function computes all pairwise distances for a given row, i.e., unit, in the dataset.
# The distances are returned in a vector. NAs are removed.

row.dist = function(x, dist, ...)
{
    d = NULL
    n = length(x)
    for (i in 1:(n - 1))
        for (j in (i + 1):n)
            d = c(d, dist(x[i], x[j], ...))
    d = d[! is.na(d)]
    d
}

# The following function computes all of the row statistics for a given dataset, using the mean distance for
# each row. The row statistics are returned in a vector. NAs are removed. This function is used when the data
# are nominal, or when the data are ordinal and dist.type is equal to "mean".

row.stat = function(data, dist, ...)
{
    n = nrow(data)
    g = numeric(n)
    for (i in 1:n)
        g[i] = 1 - mean(row.dist(data[i, ], dist, ...))
    g = g[! is.na(g)]
    g
}

# The following function computes all of the row statistics for a given dataset, using the maximum distance for
# each row. The row statistics are returned in a vector. NAs are removed. This function is used when the data
# are ordinal and dist.type is equal to "max".

row.stat.max = function(data, dist, ...)
{
    n = nrow(data)
    g = numeric(n)
    for (i in 1:n)
        g[i] = 1 - max(row.dist(data[i, ], dist, ...))
    g = g[! is.na(g)]
    g
}
    
#' Apply the Bayesian Gower agreement methodology to nominal or ordinal data.
#'
#' @details This is the package's flagship function. It applies the Bayesian Gower methodology to nominal or ordinal data, and provides an estimate of the posterior mean along with a credible interval.
#'
#' 
#'
#' @param data a matrix of scores. Each row corresponds to a unit, each column to a coder.
#' @param data.type the type of scores to be analyzed, either \code{"nominal"} or \code{"ordinal"}.
#' @param dist.type for ordinal data, whether the row statistics are computed using the mean of the pairwise distances or the maximum pairwise distance.
#' @param design the sampling design, either \code{"one-way"} or \code{"two-way"}. For the former, the units are random and the coders are fixed. For the latter, both the units and the coders are random.
#' @param iter the desired size of the posterior sample. The default value is 10,000.
#' @param \dots additional arguments for the distance function. These are ignored for nominal data. For ordinal data the range of the scores must be provided via argument \code{range}.  
#'
#' @return Function \code{gower.agree} returns an object of class \code{"gower"}, which is a list comprising the following elements.
#'         \item{mu.hat}{the estimate of the posterior mean.}
#'         \item{mu.sample}{the posterior sample.}
#'         \item{call}{the matched call.}
#'         \item{units}{the number of units.}
#'         \item{coders}{the number of coders.}
#'         \item{data}{the data matrix, sans rows that were removed due to missigness.}
#'         \item{data.type}{the type of scores, nominal or ordinal.}
#'         \item{dist.type}{for ordinal data, the manner in which the row statistics were computed.}
#'         \item{design}{the sampling design, one-way or two-way.}
#'         \item{row.stats}{the vector of row statistics.}
#'         \item{del}{the number of rows that were deleted due to missingness.}
#'
#' @export
#'
#' @examples
#'
#' # Fit the liver data, using the mean distance for each row of the data matrix.
#' # The range (which is equal to 4) must be passed to gower.agree since these
#' # data are ordinal and the L1 distance function is used. We assume a one-way
#' # sampling design for these data, i.e., units are random and coders are fixed.
#'
#' data(liver)
#' liver = as.matrix(liver)
#' fit = gower.agree(liver, data.type = "ordinal", range = 4)
#' summary(fit)

gower.agree = function(data, data.type = c("nominal", "ordinal"), dist.type = c("mean", "max"),
                       design = c("one-way", "two-way"), iter = 10000, ...)
{
    call = match.call()
    if (missing(data) || ! is.matrix(data) || ! is.numeric(data))
        stop("You must supply a numeric data matrix.")
    n = nrow(data)
    m = ncol(data)
    del = NULL
    for (i in 1:n)
    {
        present = sum(! is.na(data[i, ]))
        if (present < 2)
            del = c(del, i)
    }
    if (length(del) > 0)
        data = data[-del, ]
    n = nrow(data)
    data.type = match.arg(data.type)
    if (data.type == "nominal")
        dist = discrete.dist
    else
        dist = L1.dist
    dist.type = match.arg(dist.type)
    design = match.arg(design)
    if (! is.wholenumber(iter) || iter < 1)
    {
        warning("Argument 'iter' must be a positive integer. Using the default value of 10,000.")
        iter = 10000
    }
    if (design == "one-way")
    {
        if (dist.type == "mean")
            g = row.stat(data, dist, ...)
        else
            g = row.stat.max(data, dist, ...)
        mu.sample = numeric(iter)
        for (j in 1:iter)
        {
            u = runif(n - 1)
            w = diff(c(0, sort(u), 1))
            mu.sample[j] = w %*% g
        }
    }
    else
    {
        mu.sample = numeric(iter)
        for (j in 1:iter)
        {
            rows = sample(1:n, n)
            dat = data[rows, ]
            cols = sample(1:m, m)
            dat = dat[, cols]
            if (dist.type == "mean")
                g = row.stat(data, dist, ...)
            else
                g = row.stat.max(data, dist, ...)
            u = runif(n - 1)
            w = diff(c(0, sort(u), 1))
            mu.sample[j] = w %*% g
        }
        if (dist.type == "mean")
            g = row.stat(data, dist, ...)
        else
            g = row.stat.max(data, dist, ...)
    }
    mu.hat = mean(mu.sample)
    names(mu.hat) = "mu"
    object = list(call = call, units = n, coders = m, data = data, data.type = data.type, row.stats = g,
                  dist.type = dist.type, design = design, mu.hat = mu.hat, mu.sample = mu.sample, del = length(del))
    class(object) = "gower"
    object
}

#' Compute DFBETAs for units and/or coders.
#'
#' @details This function computes DFBETAs for one or more units and/or one or more coders.
#'
#' @param model a fitted model object, the result of a call to \code{\link{gower.agree}}.
#' @param units a vector of integers. A DFBETA will be computed for each of the corresponding units.
#' @param coders a vector of integers. A DFBETA will be computed for each of the corresponding coders.
#' @param ... additional arguments. These are ignored.
#
#' @return A list comprising at most four elements.
#'         \item{dfbeta.units}{a vector containing DFBETAs for the units specified via argument \code{units}.}
#'         \item{dfbeta.coders}{a vector containing DFBETAs for the coders specified via argument \code{coders}.}
#'         \item{fits.units}{a list containing fit objects for the omitted units specified via argument \code{units}.}
#'         \item{fits.coders}{a list containing fit objects for the omitted coders specified via argument \code{coders}.}
#'
#' @method influence gower
#'
#' @export
#'
#' @examples
#'
#' # Analyze nominal data previously considered by Krippendorff.
#' # Assume a one-way design. Compute a DFBETA for unit 6, which
#' # should be rather influential.
#'
#' kripp = matrix(c(1,2,3,3,2,1,4,1,2,NA,NA,NA,
#'                  1,2,3,3,2,2,4,1,2,5,NA,3,
#'                  NA,3,3,3,2,3,4,2,2,5,1,NA,
#'                  1,2,3,3,2,4,4,1,2,5,1,NA), 12, 4)
#' kripp
#'
#' set.seed(12)
#' fit = gower.agree(kripp)
#' summary(fit)
#' inf = influence(fit, units = 6)
#'
#' # Report the DFBETA for unit 6 and the estimate of mu when unit 6
#' # is ommitted.
#'
#' inf$dfbeta.units
#' fit$mu.hat - inf$dfbeta.units

influence.gower = function(model, units, coders, ...)
{
    mod.call = model$call
    dfbeta.units = NULL
    fits.units = NULL
    if (! missing(units))
    {
        dfbeta.units = numeric(length(units))
        k = 1
        for (j in units)
        {
            mod.call$data = model$data[-j, ]
            fit = eval(mod.call)
            fits.units[[k]] = fit
            dfbeta.units[k] = model$mu.hat - fit$mu.hat
            k = k + 1
        }
        names(dfbeta.units) = units
    }
    dfbeta.coders = NULL
    fits.coders = NULL
    if (! missing(coders))
    {
        dfbeta.coders = numeric(length(coders))
        k = 1
        for (j in coders)
        {
            mod.call$data = model$data[, -j]
            fit = eval(mod.call)
            fits.coders[[k]] = fit
            dfbeta.coders[k] = model$mu.hat - fit$mu.hat
            k = k + 1
        }
        names(dfbeta.coders) = coders
    }
    result = list()
    if (! is.null(dfbeta.units))
    {
        result$dfbeta.units = dfbeta.units
        result$fits.units = fits.units
    }
    if (! is.null(dfbeta.coders))
    {
        result$dfbeta.coders = dfbeta.coders
        result$fits.coders = fits.coders
    }
    result        
}

#' Print a summary of a Bayesian Gower fit.
#'
#' @details This function prints a summary of the fit.
#'
#' @param object an object of class \code{"gower"}, the result of a call to \code{\link{gower.agree}}.
#' @param conf.level the confidence level for the credible interval. The default is 0.95.
#' @param digits the number of significant digits to display. The default is 4.
#' @param \dots additional arguments. These are currently ignored.
#'
#' @return A list containing various elements of the summary is returned invisibly, but this function should be called for its side effects.
#'
#' @seealso \code{\link{gower.agree}}
#'
#' @method summary gower
#'
#' @export
#'
#' @examples
#'
#' # Fit the liver data, using the mean distance for each row of the data matrix.
#' # The range (which is equal to 4) must be passed to gower.agree since these
#' # data are ordinal and the L1 distance function is used. We assume a one-way
#' # sampling design for these data, i.e., units are random and coders are fixed.
#'
#' data(liver)
#' liver = as.matrix(liver)
#' fit = gower.agree(liver, data.type = "ordinal", range = 4)
#' summary(fit)

summary.gower = function(object, conf.level = 0.95, digits = 4, ...)
{
    cat("\nBayesian Gower agreement\n========================\n")
    cat("\nData:", object$units, "units x", object$coders, "coders\n")
    ans = NULL
    ans$units = object$units
    ans$coders = object$coders
    if (object$del > 0)
        cat("\n", object$del, " rows removed due to missingness\n", sep = "")
    cat("\nData type:", object$data.type, "\n")
    ans$data.type = object$data.type
    if (object$data.type == "ordinal")
    {
        cat("\nDistance type:", object$dist.type, "\n")
        ans$dist.type = object$dist.type
    }
    cat("\nSampling design:", object$design, "\n")
    ans$design = object$design
    cat("\nPosterior sample size:", length(object$mu.sample), "\n")
    cat("\nCall:\n\n")
    print(object$call)
    ans$call = object$call
    coef.table = cbind(object$mu.hat, t(confint(object, level = conf.level)))
    colnames(coef.table) = c("Estimate", "Lower", "Upper")
    rownames(coef.table) = names(object$mu.hat)
    ans$coef.table = coef.table
    cat("\nResults:\n\n")
    print(signif(coef.table, digits = digits))
    cat("\n")
    invisible(ans)
}

#' Compute a credible interval for a Bayesian Gower fit.
#'
#' @details This function computes a credible interval for mu, the agreement parameter.
#'
#' This function uses the bootstrap sample to compute a credible interval. If \code{type = "HPD"} is not an element of \dots, the lower and upper limits of the interval are appropriate quantiles of the bootstrap sample. Otherwise the limits are for an HPD interval.
#'
#' @param object an object of class \code{"gower"}, the result of a call to \code{\link{gower.agree}}.
#' @param parm always ignored since there is only one parameter, mu.
#' @param level the desired confidence level for the interval. The default is 0.95.
#' @param \dots additional arguments. These are passed to \code{\link{quantile}}.
#'
#' @return A vector with entries giving the lower and upper limits of the interval. These will be labelled as (1 - level) / 2 and 1 - (1 - level) / 2 if the quantile method was used. They will be labeled "Lower" and "Upper" if the interval is an HPD interval.
#'
#' @seealso \code{\link{gower.agree}}
#'
#' @method confint gower
#'
#' @export
#'
#' @examples
#'
#' # Fit the liver data, using the mean distance for each row of the data matrix.
#' # The range (which is equal to 4) must be passed to gower.agree since these data
#' # are ordinal and the L1 distance function is used. We assume a one-way sampling
#' # design for these data, i.e., units are random and coders are fixed. Also compute
#' # a 95% credible interval using the quantile method and then the HPD method.
#'
#' data(liver)
#' liver = as.matrix(liver)
#' fit = gower.agree(liver, data.type = "ordinal", range = 4)
#' confint(fit)
#' confint(fit, type = "HPD")

confint.gower = function(object, parm = "mu", level = 0.95, ...)
{
    n = object$units
    args = list(...)
    if ("type" %in% names(args) && args$type == "HPD")
    {
        ci = hpd(object$mu.sample, 1 - level)
        names(ci) = c("Lower", "Upper")
    }
    else
    {
        alpha = pnorm(sqrt(n / (n - 1)) * qt((1 - level) / 2, df = n - 1))
        ci = quantile(object$mu.sample, c(alpha, 1 - alpha))
        names(ci) = c((1 - level) / 2, 1 - (1 - level) / 2)
    }
    ci
}

Any scripts or data that you put into this service are public.

goweragreement documentation built on Sept. 11, 2024, 7:20 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

goweragreement
Bayesian Gower Agreement for Categorical Data

R/gower.R
In goweragreement: Bayesian Gower Agreement for Categorical Data

Defines functions confint.gower summary.gower influence.gower gower.agree row.stat.max row.stat row.dist L1.dist discrete.dist hpd is.wholenumber

Documented in confint.gower discrete.dist gower.agree hpd influence.gower L1.dist summary.gower

Try the goweragreement package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

goweragreement Bayesian Gower Agreement for Categorical Data

R/gower.R In goweragreement: Bayesian Gower Agreement for Categorical Data

Defines functions confint.gower summary.gower influence.gower gower.agree row.stat.max row.stat row.dist L1.dist discrete.dist hpd is.wholenumber

Documented in confint.gower discrete.dist gower.agree hpd influence.gower L1.dist summary.gower

Try the goweragreement package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

goweragreement
Bayesian Gower Agreement for Categorical Data

R/gower.R
In goweragreement: Bayesian Gower Agreement for Categorical Data