partition: Hierarchical Partitioning from a List of Goodness of Fit...
In cjbwalsh/hier.part: Hierarchical Partitioning



Partitions variance in a multivariate dataset from a list of goodness of fit measures

partition(gfs, pcan, var.names = NULL)

`gfs`	an array as output by the function `all.regs`, or a vector of goodness of fit measures from a hierarchy of regressions based on `pcan` variables in ascending order (as produced by the function `combos`, but also including the null model as the first element)
`pcan`	the number of variables from which the hierarchy was constructed (maximum = 12)
`var.names`	an array of pcan variable names, if required
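
As a rough illustration of the shape expected when `gfs` is supplied as a vector: a hierarchy built from `pcan` variables contains 2^pcan models once the null model is included. The base R sketch below labels those models with the null model first and then subsets of increasing size. The helper `subset_labels` is hypothetical, and its ordering within each subset size (that of `utils::combn`) is only assumed to match the order produced by `combos`.

```r
# Hypothetical helper: label every model in a hierarchy of pcan variables,
# null model first, then subsets of increasing size.
# Assumption: within each size, the order follows utils::combn().
subset_labels <- function(pcan) {
  labels <- "null"
  for (size in seq_len(pcan)) {
    sets <- utils::combn(pcan, size, simplify = FALSE)
    labels <- c(labels, vapply(sets, paste, character(1), collapse = " "))
  }
  labels
}

labels <- subset_labels(4)
length(labels)  # 16, i.e. 2^4 goodness of fit values for pcan = 4
```

A vector of any other length for the stated `pcan` would suggest that a model, possibly the null model, is missing from the hierarchy.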

This function applies the hierarchical partitioning algorithm of Chevan and Sutherland (1991) to return a simple table listing each variable, its independent contribution (I) and its conjoint contribution with all other variables (J). The output is identical to that of the function `hier.part`, which takes the dependent and independent variable data as its input.

Note that earlier versions of `partition` (hier.part < 1.0) produced a matrix and barplot of the distribution of effects as a percentage of the sum of all Is and Js, as portrayed in Hatt et al. (2004) and Walsh et al. (2004). This version plots the percentage distribution of independent effects only. The sum of Is equals the goodness of fit measure of the full model minus the goodness of fit measure of the null model.

The distribution of joint effects shows the relative contribution of each variable to shared variability in the full model. Negative joint effects are possible for variables that act as suppressors of other variables (Chevan and Sutherland 1991).

At this stage, the partition routine will not run for more than 12 independent variables.
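
The averaging idea behind the algorithm can be sketched in base R for a small hierarchy. This is a minimal illustration, not the package's compiled routine: the goodness of fit values are made up, the helpers `model_key` and `independent` are hypothetical, and J is taken here to be the fit of the single-variable model (relative to the null model) minus I, which is an assumption about the definition used.

```r
# Toy goodness of fit values (hypothetical R-squared figures) for every
# model in a hierarchy of three predictors; "null" is the null model.
vars <- c("a", "b", "c")
gof <- c(null = 0.00, a = 0.30, b = 0.20, c = 0.10,
         "a+b" = 0.40, "a+c" = 0.35, "b+c" = 0.25, "a+b+c" = 0.45)

# Hypothetical helper: name of the model that uses the variables in s
model_key <- function(s) {
  if (length(s) == 0) "null" else paste(sort(s), collapse = "+")
}

# Independent contribution of v: the average, over hierarchy levels, of the
# mean improvement in fit obtained by adding v to each model that lacks it
independent <- function(v, vars, gof) {
  others <- setdiff(vars, v)
  level_means <- vapply(0:length(others), function(size) {
    if (size == 0) {
      subsets <- list(character(0))
    } else {
      subsets <- utils::combn(others, size, simplify = FALSE)
    }
    gains <- vapply(subsets, function(s) {
      gof[[model_key(c(s, v))]] - gof[[model_key(s)]]
    }, numeric(1))
    mean(gains)
  }, numeric(1))
  mean(level_means)
}

I <- vapply(vars, independent, numeric(1), vars = vars, gof = gof)
# Assumed definition: joint contribution = single-variable fit minus I
J <- gof[vars] - gof[["null"]] - I

# The Is should add up to the full-model fit minus the null-model fit
all.equal(sum(I), gof[["a+b+c"]] - gof[["null"]])
```

The sketch visits every subset for every variable, so its cost grows quickly with the number of variables. It is meant only to make the definition of I concrete.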

a list containing

`gfs`	a data frame listing all combinations of predictor variables in the first column in ascending order, and the corresponding goodness of fit measure for the model using those variables
`IJ`	a data frame of I, the independent contribution, and J, the joint contribution, for each predictor variable
`I.perc`	a data frame of I as a percentage of total explained variance
`J.perc`	a data frame of J as a percentage of sum of all Js
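
The two percentage tables can be read as simple rescalings of the I and J columns. The sketch below uses made-up I and J values and hypothetical object names to show the arithmetic implied by the descriptions above, assuming that total explained variance is the sum of all Is. It does not reproduce the exact layout of the data frames that `partition` returns.

```r
# Hypothetical I and J values for three predictors (not from a real analysis)
IJ <- data.frame(I = c(0.24, 0.14, 0.07),
                 J = c(0.06, 0.06, 0.03),
                 row.names = c("a", "b", "c"))

# I as a percentage of the sum of all Is (assumed total explained variance)
i_perc <- 100 * IJ$I / sum(IJ$I)
# J as a percentage of the sum of all Js
j_perc <- 100 * IJ$J / sum(IJ$J)

data.frame(I.perc = i_perc, J.perc = j_perc, row.names = row.names(IJ))
```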

The function produces a minor rounding error for hierarchies constructed from more than 9 variables. To check if this error affects the inference from an analysis, run the analysis several times with the variables entered in a different order. There are no known problems for hierarchies with 9 or fewer variables.

Chris Walsh cwalsh@unimelb.edu.au, using C and Fortran code written by Ralph Mac Nally Ralph.MacNally@gmail.com.

Chevan, A. and Sutherland, M. 1991 Hierarchical Partitioning. The American Statistician 45, 90–96.

Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. 2004 The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34, 112–124.

all.regs, hier.part, rand.hp

    # linear regression of log(electrical conductivity) in streams
    # against seven independent variables describing catchment
    # characteristics (from Hatt et al. 2004).

    library(hier.part)
    data(urbanwq)
    env <- urbanwq[, 2:8]
    gofs <- all.regs(urbanwq$lec, env, fam = "gaussian",
                     gof = "Rsqu", print.vars = TRUE)
    # var.names must hold the seven predictor names (columns 2 to 8)
    partition(gofs, pcan = 7, var.names = names(urbanwq[, 2:8]))

    # hierarchical partitioning of logistic and linear regression
    # goodness of fit measures from Chevan and Sutherland (1991).

    data(chevan)
    partition(chevan$chisq, pcan = 4)
    partition(chevan$rsqu, pcan = 4)