RR.means: Relative Risk (RR) components - Means in demographic group...

View source: R/RR.means.R

RR.means    R Documentation

Relative Risk (RR) components - Means in demographic group and in rest of pop, based on Census data

Description

Finds the mean indicator value in one demographic subgroup and the mean among everyone else, based on data for each spatial unit, such as block groups or tracts.

Usage

RR.means(
  e,
  d,
  pop,
  dref,
  dlab = c("group", "not"),
  formulatype = "manual",
  na.rm = TRUE
)

Arguments

e

Vector or data.frame or matrix with 1 or more environmental indicator(s) or health risk level (e.g., PM2.5 concentration to which this person or place is exposed), one row per Census unit and one column per indicator.

d

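Vector or data.frame or matrix with 1 or more demographic groups, each given as the share of the place's population that is in the selected group (e.g., percent Hispanic), expressed as a fraction of 1, not 0-100. If each row is an individual rather than a place, use d = 1 for people in the group and d = 0 for people not in it.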

pop

Vector with one element per location, providing the population count of the place (or pop = 1 if each row is an individual). It is used to convert d into a count, since d is a fraction.

dref

Optional. Note: non-default values are not yet confirmed to work here. This specifies the reference group. The default is 1 - d, meaning all people who are not in the given group, so each D group is compared to all non-D people. This is why dlab has the default it does. Alternatively, dref can specify a single reference group used for every D analyzed. For the reference group to be the entire overall population, use dref = 1, and set dlab = c('avg in group', 'avg overall'), for example. To use some other reference group, set dref to the share of the local population that is in the reference group (as a fraction of 1), given as a vector as long as the number of places (the number of rows in e or d). For example, for non-Hispanic White alone to be the reference group, if the bg data.frame has a field called pctnhwa, specify dref = bg$pctnhwa and dlab = c('mean in this group', 'mean in reference group').

dlab

Optional character vector of two names for columns of the output. The default is c('group', 'not'), where the second name refers to the reference group used.

formulatype

Optional. The default is 'manual', which is like sum(x * wts) / sum(wts) with na.rm = TRUE. Alternatively, formulatype can be 'Hmisc' to use Hmisc::wtd.mean(), or 'base' to use weighted.mean().

na.rm

Optional, default is TRUE. Has no effect if formulatype = 'manual' (the default). Results are not really right when na.rm = FALSE anyway.
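To illustrate what the formulatype and na.rm descriptions above refer to, here is a minimal sketch in base R of a weighted mean computed by hand and with weighted.mean(). The data are made up, and the variable names (x, wts) are only illustrative. How the package's 'manual' option treats the weights of places whose indicator is NA is not stated on this page, so both plausible readings are shown as assumptions rather than as the package's actual behavior.

```r
# Toy data (made up for illustration): indicator values and weights for 4 places
x   <- c(10, 12, NA, 8)      # indicator, with one missing value
wts <- c(100, 300, 50, 50)   # weights, e.g., count of people in the group

# Reading 1 (assumption): drop places with a missing indicator from BOTH sums
ok <- !is.na(x)
manual_mean <- sum(x[ok] * wts[ok]) / sum(wts[ok])

# Reading 2 (assumption): apply na.rm = TRUE to each sum separately.
# The weight of the place with a missing indicator stays in the denominator,
# which pulls the result down.
naive_mean <- sum(x * wts, na.rm = TRUE) / sum(wts, na.rm = TRUE)

# Base R: weighted.mean() with na.rm = TRUE matches reading 1
base_mean <- weighted.mean(x, wts, na.rm = TRUE)

# With na.rm = FALSE, a single missing indicator value makes the result NA
base_mean_keep_na <- weighted.mean(x, wts, na.rm = FALSE)

print(c(manual = manual_mean, naive = naive_mean,
        base = base_mean, base_keep_na = base_mean_keep_na))
```

The gap between the two readings is one concrete way that NA values in e could matter, so it may be worth checking the indicator columns for missing values before interpreting results.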

Details

This function requires, for each Census unit, demographic data on total population and percent in each demographic group, plus one or more indicators for each Census unit, such as health status, exposure estimates, or environmental health risk.

For example, given population count, percent Hispanic, and ppm of ozone for each tract, this calculates the population-weighted mean tract-level ozone concentration among Hispanics and the same value among all non-Hispanics. The result is a table of means for a demographic subset, or for each of several groups and indicators.

Each e (environmental indicator) or d (demographic percentage) is specified as a vector over small places such as Census blocks or block groups, or ideally even individuals. In the case of individuals, d would be a dummy variable equal to 1 for people in the selected group and 0 for people not in it.

Note: whether NA values could cause a problem here is an open question.
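The ozone example above can be worked through by hand. The following is a minimal sketch in base R with made-up numbers for three tracts. It does not call RR.means() and is not the package's implementation. It only shows the arithmetic that the description implies, including the alternative of using the overall population as the reference group (the dref = 1 case).

```r
# Toy data (made up): 3 tracts
pop   <- c(1000, 2000, 1500)     # total population of each tract
d     <- c(0.10, 0.50, 0.20)     # fraction Hispanic (0-1, not 0-100)
ozone <- c(0.040, 0.060, 0.050)  # tract-level ozone, ppm

# Convert fractions to counts of people in the group and in the rest of the population
group_count <- pop * d
rest_count  <- pop * (1 - d)

# Population-weighted mean ozone in each set of people
mean_group <- sum(ozone * group_count) / sum(group_count)
mean_not   <- sum(ozone * rest_count)  / sum(rest_count)

# Ratio of the two means
ratio <- mean_group / mean_not

# Alternative reference group: everyone (the dref = 1 idea)
mean_overall <- sum(ozone * pop) / sum(pop)

print(c(group = mean_group, not = mean_not, ratio = ratio, overall = mean_overall))
```

In this toy case the group is concentrated in the tract with the highest ozone, so its mean is higher than the mean for everyone else and the ratio exceeds 1.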

Value

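Numeric results as an array. As the examples below index it, the dimensions are demographic group, statistic (including a 'ratio' column), and environmental indicator.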
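As a rough guide to working with an array result, the sketch below builds a hypothetical stand-in with placeholder values and slices it the same way the examples on this page slice the output of RR.means(). The dimension order and the set of statistic names are assumptions inferred from those examples, so the real output may contain other columns.

```r
# Hypothetical stand-in for the returned array.
# Values are placeholders, not real results.
x <- array(
  seq_len(2 * 3 * 2),
  dim = c(2, 3, 2),
  dimnames = list(
    c("pctlowinc", "pctlingiso"),   # demographic groups (assumed 1st dimension)
    c("group", "not", "ratio"),     # statistics; the first two are the dlab defaults
    c("pm", "traffic.score")        # indicators (assumed 3rd dimension)
  )
)

ratios_all     <- x[, "ratio", ]          # one statistic: groups by indicators
one_group      <- x["pctlowinc", , ]      # one group: statistics by indicators
one_indicator  <- x[, , "traffic.score"]  # one indicator: groups by statistics

print(dim(ratios_all))
print(dim(one_group))
print(dim(one_indicator))
```

When only one group or one indicator is analyzed, wrapping the call in drop(), as the examples do, removes the length-1 dimension so the result prints as a plain matrix.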

See Also

RR()

Examples


 # See examples for [RR.table()] and [RR.means()] and [RR()]

 ########################################  #

 ##    if just using ejanalysis pkg test data:
 bg <- ejanalysis::bgtest
 enames <- c("pm", "o3", "cancer", "resp", "dpm", "pctpre1960", "traffic.score",
   "proximity.npl", "proximity.rmp", "proximity.tsdf", "proximity.npdes", "ust")
 dnames <- c("pctlingiso", "pctlowinc")
 dnames.subgroups.count <- c("hisp", "nhwa", "nhba", "nhaiana",
   "nhaa", "nhnhpia", "nhotheralone", "nhmulti")
 dnames.subgroups.pct <- c("pcthisp", "pctnhwa", "pctnhba", "pctnhaiana",
   "pctnhaa", "pctnhnhpia", "pctnhotheralone", "pctnhmulti")

 ##    if EJAM pkg available:
 # bg <- as.data.frame(EJAM::blockgroupstats)
 # enames = EJAM::names_e
 # dnames = EJAM::names_d
 # dnames.subgroups.count = EJAM::names_d_subgroups_count
 # dnames.subgroups.pct  =  EJAM::names_d_subgroups

 ##    if EJAM pkg not available and using ejscreen pkg data:
 # bg <- ejscreen::bg22
 # enames = ejscreen::names.e
 # dnames = ejscreen::names.d
 # dnames.subgroups.count = ejscreen::names.d.subgroups
 # dnames.subgroups.pct  =  ejscreen::names.d.subgroups.pct

 ########################################  #

 # stats on 1 Demographic group
 drop(RR.means(bg[, enames], bg$pcthisp, bg$pop))
 # all E, all D
 x <- RR.means(bg[, enames], bg[, dnames], bg$pop)
 round(x[, 'ratio', ], 2)
 x[, , 'traffic.score']
 x['pctlowinc', , ]

 # population density of blockgroup, by demographic group
 densities <- round(drop(RR.means(
   e = data.frame(pop.density = 1000 * bg$pop / bg$area),
   d = bg[, c(dnames, dnames.subgroups.pct)],
   pop = bg$pop,
   dlab = c('Avg pop density', 'Avg if not in this demog group'))), 2)
 # sort rows by the third column, largest first
 densities[order(densities[, 3], decreasing = TRUE), ]

## Not run: 
# for ejscreen pkg data.frame
# population density of blockgroup, by demographic group
densities <- round(drop(RR.means(
  e = data.frame(pop.density = 1000 * bg$pop / bg$area),
  d = bg[, c(dnames, dnames.subgroups.pct)],
  pop = bg$pop,
  dlab = c('Avg pop density', 'Avg if not in this demog group'))),2)
# just 1 Envt factor
drop(RR.means(bg[, "traffic.score"], bg[, c(dnames, dnames.subgroups.pct)], pop = bg$pop))
# just 1 Demog group
drop(RR.means(bg[, enames], bg[, "pctlowinc"], pop = bg$pop))
# multiple E, multiple D
RR.means(bg[, enames], bg[, c("pctlowinc", "pctunemployed")], pop = bg$pop)
# All E, All D (a bit slow)
x <- RR.means(bg[, enames], bg[, dnames], pop = bg$pop)
t(round(x[, 'ratio', ], 2))

 # x <- ejanalysis::RR.means(
 #  bg[ , enames],
 #  bg[ , c(dnames, dnames.subgroups.pct)] / 100,
 #  bg$pop
 # )
 
## End(Not run)


ejanalysis/ejanalysis documentation built on April 2, 2024, 10:12 a.m.