pop.cdf: Draw PDF (overlays histograms) comparing distributions of...

View source: R/pop.cdf.R

pop.cdf    R Documentation

Draw PDF (overlays histograms) comparing distributions of scores in selected demographic groups

Description

Draws a histogram plot using plotrix::weighted.hist(), overlaying one distribution for each subgroup specified. Useful for comparing two groups based on each group's entire distribution (PDF) of people's scores, using data from small places such as Census block groups. For each place, it relies on the population total and the percentage of the population in each group (or perhaps a count in each group, if that is already available).

Usage

pop.cdf(
  scores,
  pcts,
  pops,
  allothers = TRUE,
  col = "lightblue",
  main,
  weights,
  ...
)

Arguments

scores

Numeric vector (not data.frame currently), required. Values to analyze.

pcts

Numeric vector or data.frame, required. Same number of vector elements or data.frame rows as length of scores. Specifies the fraction of population that is in demographic group(s) of interest, one row per place, one group per column.

pops

Numeric vector of population totals, one per place. Used to define weights as pops * pcts for the group and, if allothers=TRUE, as pops * (1 - pcts) for everyone not in the group.

allothers

Logical value, optional, TRUE by default. Whether to plot a series for everyone else, using 1 - pcts.

col

Optional. Color used for the key demographic group (the usage above shows a default of "lightblue"). Can also be a vector of colors, one color per group, if pcts is a data.frame with one column per group.

main

Optional character specifying plot title. Default title notes colors of lines and if reference group used.

weights

Not used currently (see the pops parameter).

...

Other optional parameters to pass to plotrix::weighted.hist().
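
As a rough illustration of how pops and pcts combine into weights, the sketch below builds per-place weights in base R. It follows the argument descriptions above rather than the package source, and the helper name make_group_weights is hypothetical. Adding the "others" column only when there is a single group is also an assumption made here for simplicity.

```r
# Hypothetical helper (not part of ejanalysis): builds per-place weights
# from population totals and group fractions, as the argument docs describe.
make_group_weights <- function(pcts, pops, allothers = TRUE) {
  pcts <- as.data.frame(pcts)            # a vector becomes a one-column data.frame
  stopifnot(nrow(pcts) == length(pops))  # one row per place
  w <- pcts * pops                       # people in each group, per place: pops * pcts
  if (allothers && ncol(pcts) == 1) {
    # Assumption: only add "others" for a single group: pops * (1 - pcts)
    w$others <- pops * (1 - pcts[[1]])
  }
  w
}

pcts <- c(0.10, 0.10, 0.40, 0, 0.20)
pops <- 1001:1005
w <- make_group_weights(pcts, pops)
```

In this sketch the group and "others" weights in each row add back up to that place's population total, which is a quick sanity check before plotting.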

Details

Notes on possible uses:
To compare zones.
To compare demographic groups (see the parameter called group).
To compare multiple groups and/or multiple zones, e.g., Hispanics vs. others in the US vs. CA, all on one graph.
See plotrix::weighted.hist() for options.
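
To make the idea of a population-weighted histogram concrete, here is a minimal base R sketch that computes the share of each group's population falling in each score bin. It is an illustration only and does not reproduce what plotrix::weighted.hist() does internally; the function name weighted_bin_share and the toy data are hypothetical.

```r
# Population-weighted histogram heights in base R (illustration only;
# pop.cdf itself draws with plotrix::weighted.hist()).
weighted_bin_share <- function(scores, w, breaks) {
  bins <- cut(scores, breaks = breaks, include.lowest = TRUE)
  totals <- tapply(w, bins, sum)     # total weight (people) in each score bin
  totals[is.na(totals)] <- 0         # empty bins hold no people
  totals / sum(totals)               # share of the group's population per bin
}

# Toy data: five places with equal population and different group fractions
scores <- c(31, 32, 33, 34, 35)
pops   <- c(1000, 1000, 1000, 1000, 1000)
pcts   <- c(0.10, 0.10, 0.40, 0, 0.20)
breaks <- c(30, 32, 34, 36)

group_share  <- weighted_bin_share(scores, pops * pcts, breaks)
others_share <- weighted_bin_share(scores, pops * (1 - pcts), breaks)
```

The two vectors can then be compared bin by bin, or drawn side by side with something like barplot(rbind(group_share, others_share), beside = TRUE).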

Value

Draws a plot

See Also

Hmisc::Ecdf() RR() pop.cdf() pop.cdf2() pop.ecdf() pop.cdf.density()

Examples

## Not run: 


bg <- ejscreen::bg22[, c(ejscreen::names.d, 'pop', ejscreen::names.e, 'REGION')]

# Keep only places with a non-missing PM2.5 score, so all vectors have the same length
ok <- !is.na(bg$pm)
e <- bg$pm[ok]
dpct <- bg$pctmin[ok]
pop <- bg$pop[ok]
dcount   <- pop * dpct
refcount <- pop * (1 - dpct)
brks <- 0:17
etxt <- 'PM2.5'
dtxt <- 'Minorities'

pop.cdf(        e, pcts = dpct, pops = pop)
pop.cdf2(       e, dcount, refcount, etxt, dtxt, brks)
pop.cdf.density(e, dcount, refcount, etxt, dtxt )


# pop.cdf( 31:35, c(0.10, 0.10, 0.40, 0, 0.20), 1001:1005 )

set.seed(99)
pctminsim <- c(runif(7000, 0, 1), pmin(rlnorm(5000, meanlog = log(0.30), sdlog = 1.7), 4) / 4)
popsim <- runif(12000, 500, 3000)
esim <- rlnorm(12000, log(10), log(1.15)) + rnorm(12000, 1, 0.5) * pctminsim - 1
pop.cdf(esim, pctminsim, popsim, xlab = 'Tract air pollution levels',
  main = 'Air pollution levels among minorities (red bars) vs rest of US pop.')

#
# pop.cdf(bg$pm, bg$pctmin, bg$pop)
# pop.cdf(log10(places$traffic.score), places$pctmin, places$pop)
# pop.cdf(places$cancer, places$pctmin, places$pop, allothers=FALSE)
# pop.cdf(places$cancer, places$pctlingiso, places$pop, col='green', allothers=FALSE, add=TRUE)
# Demographic susceptibility for each REGION (hard to read if also plotted vs. others)
pop.cdf(bg$traffic.score, bg$VSI.eo, bg$pop, log = 'x', subtitles = FALSE,
        group = bg$REGION, allothers = FALSE,
        xlab = 'Traffic score (log scale)', ylab = 'frequency in population',
        main = 'Distribution of scores by EPA Region')

# Demographic susceptibility (how to show vs. others?), one panel per environmental factor
# (i.e., per column in a scores data.frame)
data('names.e')
# NOT supported currently: scores must be a numeric vector, not a data.frame
pop.cdf(bg[ , names.e], bg$VSI.eo, bg$pop, log = 'x', subtitles = FALSE,
        allothers = TRUE, ylab = 'frequency in population',
        main = 'Distribution of scores by EPA Region')

# log scale is useful & so are these labels passed to function
# in CA vs not CA
pop.cdf(bg$traffic.score, bg$ST=='CA', bg$pop,
         subtitles=FALSE,
         log='x', ylab='frequency in population', xlab='Traffic scores (log scale)',
         main='Distribution of scores in CA (red) vs rest of US')

# Flagged vs not (all D, all zones)
pop.cdf(bg$traffic.score, bg$flagged, bg$pop, log='x')

# Hispanics in CA vs Hispanics elsewhere (pcts is the CA indicator; pops is weighted by pcthisp)
pop.cdf(bg$traffic.score, bg$ST=='CA', bg$pop * bg$pcthisp, log='x')
# Demographically susceptible population in CA vs elsewhere (pops is weighted by VSI.eo)
pop.cdf(bg$traffic.score, bg$ST=='CA', bg$pop * bg$VSI.eo, log='x')


## End(Not run)

ejanalysis/ejanalysis documentation built on April 2, 2024, 10:12 a.m.