assign.pctiles.alt2: Assign percentiles to values (alternative formula, and not by...

View source: R/assign.pctiles.alt2.R

assign.pctiles.alt2    R Documentation

Assign percentiles to values (alternative formula, and not by zone)

Description

For each column, looks at the distribution of values across all rows and finds the percentile at which each value falls.

Usage

assign.pctiles.alt2(x, weights = NULL, na.rm = TRUE, zone = NULL)

Arguments

x

A vector or data.frame of values to be assigned percentiles.

weights

Optional vector of weights for weighted percentiles (e.g., population-weighted). NULL by default. Not fully tested.

na.rm

Logical, optional, TRUE by default. Should NA values (missing data) be removed first, so that percentiles are calculated only among rows with valid data? If FALSE, NA values are treated as being at the high percentiles.

zone

Optional, NULL by default. Not yet implemented here.

Details

Assigns each percentile as the cumulative sum of the weights, ranked by the value x, and then fixes ties.

A parameter like na.last in rank() could be added, to define not only na.rm but also where NA values are ranked if they are included. The default now is like na.last=NA, but like na.last=TRUE (NA values ranked last) if na.rm=FALSE.

A parameter like ties.method in rank() could also be added, to define whether ties get the min, max, or mean of the percentiles initially assigned to the tied values. The default for ties right now is like ties.method="max" (which might not be what assign.pctiles() actually does).
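The steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's source: the function name pctile_sketch is hypothetical, results are returned as proportions between 0 and 1 (the real function may scale them differently), and NA inputs are assumed to stay NA in the output.

```r
# Hypothetical sketch, not the actual code of assign.pctiles.alt2().
# Weighted percentile = cumulative share of weight at or below each value,
# with tied values all receiving the highest percentile in their group.
pctile_sketch <- function(x, weights = NULL, na.rm = TRUE) {
  if (is.null(weights)) weights <- rep(1, length(x))
  out <- rep(NA_real_, length(x))

  # Positions of valid values, sorted by increasing x
  valid <- which(!is.na(x))
  ord <- valid[order(x[valid])]

  # Accumulate the weights in rank order
  cumw <- cumsum(weights[ord])

  # Fix ties: every tied value gets the largest cumulative weight in its group
  # (comparable to ties.method = "max" in rank())
  cumw <- ave(cumw, x[ord], FUN = max)

  # With na.rm = FALSE the weights of NA rows stay in the denominator,
  # which is equivalent to ranking NA values above all valid values
  total <- if (na.rm) sum(weights[valid]) else sum(weights)

  out[ord] <- cumw / total
  out
}

x <- c(5, 5, 10, NA)
w <- c(1, 1, 2, 4)
pctile_sketch(x, w)                 # percentiles among valid data only
pctile_sketch(x, w, na.rm = FALSE)  # NA rows take up the top of the range
```

With na.rm = FALSE in this sketch, the largest valid value cannot reach the top percentile whenever any NA row carries weight, which matches the behavior described for the na.rm argument.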

Value

Returns a numeric vector or data.frame of the same size as x.
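To illustrate what "same size as x" means for a data.frame, here is a small base R sketch that applies a vector-wise percentile function to each column separately and keeps the original shape. Both pctile_by_column and simple_pctile are hypothetical names used only for this illustration, and simple_pctile is an unweighted stand-in, not the formula used by this package.

```r
# Hypothetical helper: apply a vector-wise percentile function to each
# column of a data.frame, returning an object shaped like the input.
pctile_by_column <- function(x, FUN, ...) {
  if (is.data.frame(x)) {
    out <- x
    out[] <- lapply(x, FUN, ...)  # assigning into out[] keeps dims and names
    return(out)
  }
  FUN(x, ...)
}

# Unweighted stand-in: share of non-NA values at or below each value
simple_pctile <- function(v) {
  vapply(v, function(val) {
    if (is.na(val)) NA_real_ else mean(v <= val, na.rm = TRUE)
  }, numeric(1))
}

df <- data.frame(a = c(1, 2, 2, 4), b = c(10, NA, 30, 20))
pctile_by_column(df, simple_pctile)
```

Each column is ranked against its own distribution, so a value's percentile in column a does not depend on the values in column b.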

See Also

make.bin.pctile.cols() and assign.pctiles()

Examples

x <- c(30, 40, 50, 12,12,5,5,13,13,13,13,13,8,9,9,9,9,9,10:20,20,20,20,21:30)
weights <- rep(c(2,3), length(x)/2)
cbind(weights, x, PCTILE=assign.pctiles.alt2(x,weights))

# If na.rm=FALSE, the percentile is among all rows, not just those with valid data,
# but then the NA values take up the high percentiles, so valid values cannot reach them:
x <- c(NA, NA, NA, NA,NA,NA,NA,NA,NA,NA,13,13,8,9,9,9,9,9,10:20,20,20,20,21:30)
weights <- rep(c(2,3), length(x)/2)
cbind(weights, x, PCTILE.alt2=assign.pctiles.alt2(x, weights, na.rm=FALSE),
 pctile=assign.pctiles(x,weights))[order(x),]
cbind(weights, x, PCTILE.alt2=assign.pctiles.alt2(x, weights, na.rm=TRUE),
 pctile=assign.pctiles(x,weights))[order(x),]

# Direct check of the weighted percentile of one value, V:
V <- 9
sum(weights[!is.na(x) & x <= V]) / sum(weights[!is.na(x)])

# A value (V) being at this PCTILE% means that (assuming na.rm=TRUE):

# V >= x  for        PCTILE% of weights     (for non-NA x), so
# V < x   for 100% - PCTILE% of weights     (for non-NA x), or
# PCTILE% of all weights have V >= x (for non-NA x), so
# 100% - PCTILE% of all weights have V < x  (for non-NA x).

x <- c(32, NA, NA, NA,NA,NA,NA,NA,NA,NA,13,13,8,9,9,9,9,9,10:20,20,NA,20,21:30)
weights <- rep(c(2,3), length(x)/2)
cbind(weights, x, PCTILE.alt2=assign.pctiles.alt2(x, weights, na.rm=FALSE),
 pctile=assign.pctiles(x,weights))[order(x),]
cbind(weights, x, PCTILE.alt2=assign.pctiles.alt2(x, weights, na.rm=TRUE),
 pctile=assign.pctiles(x,weights))[order(x),]

ejanalysis/ejanalysis documentation built on April 2, 2024, 10:12 a.m.