write.wtd.pctiles.by.zone: create lookup table as file of pop-weighted or unwtd...
In ejanalysis/ejanalysis: Tools for Environmental Justice (EJ) Analysis

View source: R/write.wtd.pctiles.by.zone.R

write.wtd.pctiles.by.zone

R Documentation

create lookup table as file of pop-weighted or unwtd percentiles, mean, sd for US or by state or region

Description

check which functions actually get used for this now.

Usage

write.wtd.pctiles.by.zone(
  mydf,
  wts = NULL,
  filename = NULL,
  zone.vector = NULL,
  zoneOverallName = "USA"
)

Arguments

`mydf`	data.frame with numeric data. Each column will be examined to calculate mean, sd, and percentiles, for each zone.
`wts`	optional vector of numbers such as population counts as weights, as long as nrow(mydf)
`filename`	prefix to use for filename to be saved locally (.csv is added by the function). If not provided, no file is saved.
`zone.vector`	optional names of states or regions, for example. same length as wts, or rows in mydf
`zoneOverallName`	optonal If not by zone, name of entire domain to use in table column called REGION. Default is USA.

Details

also see:
ejscreen.lookuptables ??

make.bin.pctile.cols uses assign.pctiles

write.wtd.pctiles.by.zone uses wtd.pctiles.exact or pctiles.exact
pctiles.exact uses stats::quantile(x, type = 1, probs = (1:100)/100, na.rm = TRUE)) The inverse of quantile is ecdf (empirical cumulative distribution function) a step function with jumps i/n at observation values, where i is the number of tied observations at that value. Missing values are ignored.

For observations x= (x1,x2, ... xn), Fn is the fraction of observations less or equal to t, i.e.,

Fn(t) = #xi <= t/n = 1/n sum(i=1,n) Indicator(xi <= t).

stats::quantile() can use nine different quantile algorithms discussed in Hyndman and Fan (1996), – Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in statistical packages, American Statistician 50, 361–365. doi: 10.2307/2684934. defined as weighted averages of consecutive order statistics.

type 1 is used here, and is "Inverse of empirical distribution function."

  Sample quantiles of type i are defined by:

               Q[i](p) = (1 - Y) x[j] + Y x[j+1],

               i = type of formula to use (1 through 9).
               p is the percentage (0 through 1).
               x[j] is the jth order statistic,
               (j-m)/n less than or equal to p < (j-m+1)/n,
               n is the sample size, the value of
               Y is a function of
                 j = floor(np + m)     --so this is roughly how many data points are smaller.
                 g =       np + m - j,  --so this is roughly
                 m = a constant determined by the sample quantile type. (m= 0 or -0.5 here)
    Discontinuous sample quantile types 1, 2, and 3

    For types 1, 2 and 3, Q[i](p) is a discontinuous function of p, with
    m = 0 when i = 1 or i = 2, and m = -1/2 when i = 3.
    Type 1 =  Inverse of empirical distribution function. Y = 0 if g = 0, and 1 otherwise.

wtd.pctiles.exact uses
  Hmisc::wtd.quantile(x, wts, type = "i/n", probs = (1:100)/100) ,  na.rm = na.rm))
  "i/n" uses the inverse of the empirical distribution function,
  using  wt/T, where wt is the cumulative weight and T is the total weight (usually total sample size).

table.pop.pctile and

map service with lookup tables

Examples


  # bg = ejscreen::bg21plus # want demog subgroups but also want PR eventually
  # pctilevariables <- c(names.e, names.d, names.d.subgroups, names.ej)
  # ejanalysis::write.wtd.pctiles(mydf = bg[ , pctilevariables], wts = bg$pop, filename =  'lookupUSA21')
  # ejanalysis::write.wtd.pctiles.by.zone(mydf = bg[ , pctilevariables], wts = bg$pop,
  #                                 zone.vector = bg$REGION, filename =  'lookupRegions21')
  # ejanalysis::write.wtd.pctiles.by.zone(mydf = bg[ , pctilevariables], wts = bg$pop,
  #                                 zone.vector = bg$ST,     filename =  'lookupStates21')

ejanalysis/ejanalysis documentation built on Oct. 10, 2024, 6:44 p.m.