up: Create a data frame at a higher level of aggregation


Description

Produces a higher-level data set with one row per cluster. The data set can contain only variables that are invariant within each cluster, or it can also include summaries (means or modes) of variables that vary within clusters. Adapted from gsummary in the nlme package.
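
To make "invariant within each cluster" concrete, here is a minimal base-R sketch on a made-up data frame. It is not the implementation of up; the names d, is_invariant and d_up are hypothetical and only illustrate the idea of keeping one row per cluster with cluster-level variables.

```r
# Toy data (hypothetical): 'sector' is constant within school, 'mathach' is not.
d <- data.frame(
  school  = c("A", "A", "B", "B", "B"),
  sector  = c("Public", "Public", "Catholic", "Catholic", "Catholic"),
  mathach = c(10, 14, 8, 12, 16),
  stringsAsFactors = FALSE
)

# A variable is cluster-invariant if it has exactly one distinct value per cluster.
is_invariant <- function(x, g) {
  all(tapply(x, g, function(v) length(unique(v))) == 1)
}
invariant <- sapply(d, is_invariant, g = d$school)

# Keep only the invariant columns and the first row of each cluster.
d_up <- d[!duplicated(d$school), invariant, drop = FALSE]
d_up
```

With all = TRUE, the varying column (here mathach) would instead be kept and replaced by a per-cluster summary.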

Usage

up(object, form = formula(object), all = FALSE, FUN = function(x) mean(x, na.rm = TRUE), omitGroupingFactor = FALSE, groups, invariantsOnly = !all, ...)

Arguments

object

a data frame to be summarized.

form

a one-sided formula identifying the variable(s) in object that define the clusters, e.g. ~ school/Sex to get a summary within each Sex of each school.
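
As a rough illustration of what a nested formula such as ~ school/Sex means, the base-R sketch below builds one cluster label per observed school-by-Sex combination. The data frame d and the variable cluster are hypothetical, and interaction() is used here only to show the grouping; how up builds its groups internally is not shown in this page.

```r
# Toy data (hypothetical): school B has no male students.
d <- data.frame(
  school = c("A", "A", "A", "B", "B"),
  Sex    = c("F", "M", "M", "F", "F")
)

# One cluster per observed school/Sex combination; unobserved ones are dropped.
cluster <- interaction(d$school, d$Sex, sep = "/", drop = TRUE)
table(cluster)
```

A summary by ~ school/Sex would then have one row for each of these clusters rather than one row per school.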

all

if TRUE, include summaries of variables that vary within clusters, otherwise keep only cluster-invariant variables.

sep

separator used to form cluster names when combining more than one clustering variable. If the separator leads to the same name for distinct clusters (e.g. if var1 has levels 'a' and 'a/b' and var2 has levels 'b/c' and 'c'), the function produces an error and a different separator should be used.
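
The name collision described above can be reproduced with paste() in base R. This sketch only demonstrates why the names clash; the variables ambiguous and distinct are hypothetical, and "|" is just one separator that happens not to occur in these levels.

```r
var1 <- c("a", "a/b")
var2 <- c("b/c", "c")

# With "/" as separator, two different clusters get the same name.
ambiguous <- paste(var1, var2, sep = "/")
ambiguous

# A separator that does not occur in the levels keeps the names distinct.
distinct <- paste(var1, var2, sep = "|")
distinct
```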

FUN

function to be used for summaries.

omitGroupingFactor

kept for compatibility with gsummary

groups

kept for compatibility with gsummary

invariantsOnly

kept for compatibility with gsummary

...

additional arguments passed to tapply when summarizing numerical variables, e.g. na.rm = TRUE
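
The following base-R sketch shows the kind of per-cluster summaries that FUN and ... control: a mean with and without na.rm = TRUE, and a mode for a categorical variable. It calls tapply directly on made-up vectors rather than calling up, and stat_mode is a hypothetical helper, not a function from this package.

```r
# Toy vectors (hypothetical), one value per student.
school  <- c("A", "A", "B", "B", "B")
mathach <- c(10, NA, 8, 12, 16)
Sex     <- c("M", "M", "F", "F", "M")

# Without na.rm, a single missing value makes the cluster mean NA.
means_raw <- tapply(mathach, school, mean)

# Extra arguments after FUN are passed on to it, as with '...' above.
means <- tapply(mathach, school, mean, na.rm = TRUE)

# Hypothetical mode helper: most frequent value (first one wins on ties).
stat_mode <- function(x) names(which.max(table(x)))
modes <- tapply(Sex, school, stat_mode)

means
modes
```

The default FUN in the usage above already sets na.rm = TRUE, so passing it through ... matters mainly when a different FUN is supplied.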

Details

up was created from nlme::gsummary and modified to make it easier to use and to make an equivalent of gsummary available when using lme4.

Value

a data frame with one row per value of the variable in form

Examples


    data(hs)
    dim( hs )
    hsu <- up( hs, ~ school )
    dim( hsu )
    
    # to also get cluster means of cluster-varying numeric variables and modes of factors:

    hsa <- up( hs, ~ school , all = TRUE )

    # to get summary proportions of cluster varying factors:

    up( cbind( hs, model.matrix( ~ Sex -1 , hs)), ~ school, all = TRUE)


    ## To plot a summary between-cluster panel along with within-cluster panels:

    hsu <- up( hs, ~ school, all = TRUE)
    hsu$school <- ' between'  # space to make it come lexicographically before cluster names

    library( lattice )
    xyplot( mathach ~ ses | school, rbind(hs,hsu),
        panel = function( x, y, ...) {
            panel.xyplot( x, y, ...)
            panel.lmline( x, y, ...)
        } )
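
The summary-proportions example above works because the mean of a 0/1 indicator column is the proportion of rows in that category. A minimal base-R sketch of that step, using a made-up data frame instead of hs (the names d, mm and prop_F are hypothetical):

```r
# Toy data (hypothetical): school A has no female students, school B has 2 of 3.
d <- data.frame(
  school = c("A", "A", "B", "B", "B"),
  Sex    = c("M", "M", "F", "F", "M")
)

# One 0/1 indicator column per level of Sex (no intercept column).
mm <- model.matrix(~ Sex - 1, d)
colnames(mm)

# The cluster mean of an indicator is the cluster proportion for that level.
prop_F <- tapply(mm[, "SexF"], d$school, mean)
prop_F
```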

gmonette/spida15 documentation built on May 17, 2019, 7:26 a.m.