View source: R/assign.pctiles.alt2.R
assign.pctiles.alt2 | R Documentation |
For each column, look at the distribution of values across all rows and find the percentile of each value.
assign.pctiles.alt2(x, weights = NULL, na.rm = TRUE, zone = NULL)
x | Vector or data.frame. |
weights | Optional, NULL by default. Vector of weights for weighted percentiles (e.g., population-weighted). Not fully tested. |
na.rm | Optional logical, TRUE by default. If TRUE, NA values (missing data) are removed first, so each percentile is calculated among only those with valid data. If FALSE, NA values are treated as being at the high percentiles. |
zone | Optional, NULL by default. Not yet implemented here. |
Assigns each percentile as the cumulative sum of the weights, ranked by the value of x, and then fixes ties.
A parameter like na.last in rank() could be added, defining not only whether NA values are removed but also where they are ranked if included. The current default is like na.last=NA, or like na.last=TRUE (NA values ranked last) if na.rm=FALSE.
A parameter like ties.method in rank() could also be added, defining whether ties get the min, max, or mean of the percentiles initially assigned to the tied values. The current treatment of ties is like ties.method="max" (which might not in fact be what assign.pctiles() does).
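The cumulative-sum approach described above can be sketched in a few lines of base R. This is only an illustration, not the package's source: the helper name weighted_pctile_sketch is hypothetical, it handles a single vector rather than a data.frame, it covers only the na.rm=TRUE case (NA inputs come back as NA), and it assumes percentiles are reported on a 0 to 100 scale.

```r
# Hypothetical helper for illustration only (not part of the package).
# Weighted percentile of each value among the non-NA values of x,
# with tied values all given the highest percentile in their group.
weighted_pctile_sketch <- function(x, weights = NULL) {
  if (is.null(weights)) weights <- rep(1, length(x))
  out <- rep(NA_real_, length(x))
  valid <- !is.na(x)
  xv <- x[valid]
  wv <- weights[valid]

  ord <- order(xv)               # rank the values of x
  cumw <- cumsum(wv[ord])        # cumulative weight in ranked order

  # Fix ties: every member of a tied group gets the group's max cumulative weight
  cumw <- ave(cumw, xv[ord], FUN = max)

  res <- numeric(length(xv))
  res[ord] <- cumw               # put results back in the original order
  out[valid] <- 100 * res / sum(wv)   # assumed 0-100 scale
  out
}

x <- c(5, 9, 9, 13, NA)
w <- c(2, 3, 2, 3, 5)
cbind(w, x, pctile = weighted_pctile_sketch(x, w))
```

With these inputs the value 9 holds 7 of the 10 units of valid weight at or below it, so both tied entries come out at 70, which mirrors the ties.method="max" behavior described above.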
Returns a numeric vector or data.frame the same size as x.
make.bin.pctile.cols() and assign.pctiles()
x <- c(30, 40, 50, 12,12,5,5,13,13,13,13,13,8,9,9,9,9,9,10:20,20,20,20,21:30)
weights <- rep(c(2,3), length(x)/2)
cbind(weights, x, PCTILE=assign.pctiles.alt2(x,weights))
# Percentile among all rows, not just those with valid data, if na.rm=FALSE,
# but then NA values preclude high percentiles:
x <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 13, 13, 8, 9, 9, 9, 9, 9, 10:20, 20, 20, 20, 21:30)
weights <- rep(c(2, 3), length(x) / 2)
cbind(weights, x, PCTILE.alt2 = assign.pctiles.alt2(x, weights, na.rm = FALSE),
      pctile = assign.pctiles(x, weights))[order(x), ]
cbind(weights, x, PCTILE.alt2 = assign.pctiles.alt2(x, weights, na.rm = TRUE),
      pctile = assign.pctiles(x, weights))[order(x), ]
V <- 9
# Share of the valid (non-NA) weight with x at or below V (a fraction; multiply by 100 for a percent):
sum(weights[!is.na(x) & x <= V]) / sum(weights[!is.na(x)])
# A value (V) being at PCTILE% means (assuming na.rm=TRUE) that, among non-NA x:
#   V >= x for PCTILE% of the weights, so
#   V <  x for 100% - PCTILE% of the weights.
# Equivalently, PCTILE% of all weights have x <= V, and
#   100% - PCTILE% of all weights have x > V.
x <- c(32, NA, NA, NA,NA,NA,NA,NA,NA,NA,13,13,8,9,9,9,9,9,10:20,20,NA,20,21:30)
weights <- rep(c(2,3), length(x)/2)
cbind(weights, x, PCTILE.alt2 = assign.pctiles.alt2(x, weights, na.rm = FALSE),
      pctile = assign.pctiles(x, weights))[order(x), ]
cbind(weights, x, PCTILE.alt2 = assign.pctiles.alt2(x, weights, na.rm = TRUE),
      pctile = assign.pctiles(x, weights))[order(x), ]