Means: Means for groups of observations

Means R Documentation

Description

The function Means() creates a table of group means, optionally with standard errors, confidence intervals, and numbers of valid observations.

Usage

Means(data, ...)
## S3 method for class 'data.frame'
Means(data,
    by, weights=NULL, subset=NULL,
    default=NA,
    se=FALSE, ci=FALSE, ci.level=.95,
    counts=FALSE, ...)
## S3 method for class 'formula'
Means(data, subset, weights, ...)
## S3 method for class 'numeric'
Means(data, ...)
## S3 method for class 'means.table'
as.data.frame(x, row.names=NULL, optional=TRUE, drop=TRUE, ...)
## S3 method for class 'xmeans.table'
as.data.frame(x, row.names=NULL, optional=TRUE, drop=TRUE, ...)

Arguments

data

an object usually containing data, or a formula.

If data is a numeric vector or an object that can be coerced into a data frame, it is converted to a data frame and the data frame method of Means() is applied to it.

If data is a formula, a data frame is constructed from the variables in the formula and Means() is applied to this data frame, with the formula passed on as the by= argument.

by

a formula, a vector of variable names or a data frame or list of factors.

If by is a vector of variable names, they are extracted from data to define the groups for which means are computed, while the variables for which the means are computed are those not named in by.

If by is a data frame or a list of factors, these are used to define the groups for which means are computed, while the variables for which the means are computed are those not in by.

If by is a formula, its left-hand side determines the variables of which means are computed, while its right-hand side determines the factors that define the groups.
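The split between the two sides of such a formula can be illustrated with base R alone. The following is a minimal sketch, not the memisc implementation: it uses aggregate() on the built-in state.x77 data, and the data frame name states and the result name group_means are chosen here purely for illustration.

```r
# Base-R sketch (not the memisc implementation) of "means by group".
states <- data.frame(
    Murder     = state.x77[, "Murder"],
    Illiteracy = state.x77[, "Illiteracy"],
    region     = state.region
)

# Left-hand side: the variables to average.
# Right-hand side: the factor(s) that define the groups.
group_means <- aggregate(cbind(Murder, Illiteracy) ~ region,
                         data = states, FUN = mean)
print(group_means)
```

The result has one row per level of region and one column per left-hand-side variable, which is the same information Means() arranges as a table.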

weights

an optional vector of weights, usually a variable in data.
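As a rough illustration of what weighting does to a group mean, the sketch below computes conventional weighted means, sum(w * y) / sum(w), per group in base R. This assumes the weights enter as in an ordinary weighted mean; the page does not spell out how Means() treats them, and the toy data frame d is hypothetical.

```r
# Hypothetical toy data: two groups with unequal weights.
d <- data.frame(
    y = c(1, 3, 10, 20),
    g = factor(c("a", "a", "b", "b")),
    w = c(1, 3, 1, 1)
)

# Conventional weighted mean per group: sum(w * y) / sum(w).
# Assumption: this mirrors what a weighted group mean usually denotes.
wmeans <- sapply(split(d, d$g),
                 function(part) weighted.mean(part$y, part$w))
print(wmeans)
```

In group a the observation with weight 3 pulls the mean towards 3, while in group b the equal weights reproduce the unweighted mean.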

subset

an optional logical vector to select observations, usually the result of an expression in variables from data.

default

a default value used for empty cells without observations.

se

a logical value indicating whether standard errors should be computed.

ci

a logical value indicating whether the limits of confidence intervals should be computed.

ci.level

a number, the confidence level of the confidence interval.

counts

a logical value indicating whether the numbers of valid observations should be reported.
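The quantities controlled by se, ci, ci.level, and counts can be sketched in base R as follows. This is an illustration under stated assumptions, not the memisc code: the helper summarise_group() is hypothetical, and it uses a t-based interval, which may differ from how Means() computes its limits.

```r
# Sketch of mean, standard error, confidence limits, and number of valid
# observations for one group.
# Assumption: t-based interval; memisc may compute its limits differently.
summarise_group <- function(y, ci.level = 0.95) {
    y  <- y[!is.na(y)]                 # keep valid observations only
    n  <- length(y)
    m  <- mean(y)
    se <- sd(y) / sqrt(n)              # standard error of the mean
    q  <- qt(1 - (1 - ci.level) / 2, df = n - 1)
    c(Mean = m, SE = se, Lower = m - q * se, Upper = m + q * se, N = n)
}

murder <- state.x77[, "Murder"]
res <- t(sapply(split(murder, state.division), summarise_group))
print(round(res, 3))
```

Each row of res corresponds to one census division. Raising ci.level widens the interval, since the quantile q grows.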

x

for as.data.frame(), a result of Means().

row.names

an optional character vector. This argument currently has no effect and is included only for compatibility with the standard methods of as.data.frame.

optional

an optional logical value. This argument currently has no effect and is included only for compatibility with the standard methods of as.data.frame.

drop

a logical value, determines whether "empty cells" should be dropped from the resulting data frame.
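To see where "empty cells" come from, consider crossing two factors where most combinations never occur. The base-R sketch below is not the memisc implementation; it assumes that dropping empty cells means omitting rows for group combinations without observations, as the description suggests. The names cells, long, and dropped are illustrative.

```r
# Cross division with region: each division lies in exactly one region,
# so most of the 9 x 4 combinations contain no observations.
murder <- state.x77[, "Murder"]
cells <- tapply(murder,
                list(division = state.division, region = state.region),
                mean)                  # empty cells come out as NA

# Long format: one row per cell, including the empty ones.
long <- as.data.frame(as.table(cells), responseName = "Mean")

# Assumed analogue of drop=TRUE: keep only cells that have observations.
dropped <- long[!is.na(long$Mean), ]
print(c(all_cells = nrow(long), non_empty = nrow(dropped)))
```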

...

other arguments, either ignored or passed on to other methods where applicable.

Value

An array that inherits from classes "means.table" and "table". If Means() was called with se=TRUE or ci=TRUE, the result additionally inherits from class "xmeans.table".

Examples

# Preparing example data
USstates <- as.data.frame(state.x77)
USstates <- within(USstates,{
    region <- state.region
    name <- state.name
    abb <- state.abb
    division <- state.division
})
# Arbitrary random weights, for illustration only
USstates$w <- sample(runif(n=6),size=nrow(USstates),replace=TRUE)

# Using the data frame method
Means(USstates[c("Murder","division","region")],by=c("division","region"))
Means(USstates[c("Murder","division","region")],by=USstates[c("division","region")])
Means(USstates[c("Murder")],1)
Means(USstates[c("Murder","region")],by=c("region"))

# Using the formula method
# One 'dependent' variable
Means(Murder~1, data=USstates)
Means(Murder~division, data=USstates)
Means(Murder~division, data=USstates,weights=w)
Means(Murder~division+region, data=USstates)
as.data.frame(Means(Murder~division+region, data=USstates))

# Standard errors and counts
Means(Murder~division, data=USstates, se=TRUE, counts=TRUE)
drop(Means(Murder~division, data=USstates, se=TRUE, counts=TRUE))
as.data.frame(Means(Murder~division, data=USstates, se=TRUE, counts=TRUE))

# Confidence intervals
Means(Murder~division, data=USstates, ci=TRUE)
drop(Means(Murder~division, data=USstates, ci=TRUE))
as.data.frame(Means(Murder~division, data=USstates, ci=TRUE))

# More than one dependent variable
Means(Murder+Illiteracy~division, data=USstates)
as.data.frame(Means(Murder+Illiteracy~division, data=USstates))

# Confidence intervals for more than one dependent variable
Means(Murder+Illiteracy~division, data=USstates, ci=TRUE)
as.data.frame(Means(Murder+Illiteracy~division, data=USstates, ci=TRUE))

# Some 'non-standard' but still valid usages:
with(USstates,
     Means(Murder~division+region,subset=region!="Northeast"))

with(USstates,
     Means(Murder,by=list(division,region)))

memisc documentation built on March 31, 2023, 7:29 p.m.