get_population: Get standard population estimates from SQL

View source: R/get_population.R

get_population R Documentation

Get standard population estimates from SQL

Description

Simple front-end for pulling standard population estimates from SQL.

Usage

get_population(
  kingco = T,
  years = NA,
  ages = c(0:100),
  genders = c("f", "m"),
  races = c("aian", "asian", "black", "hispanic", "multiple", "nhpi", "white"),
  race_type = c("race_eth"),
  geo_type = c("kc"),
  group_by = NULL,
  round = FALSE,
  mykey = "hhsaw",
  census_vintage = 2020,
  geo_vintage = 2020,
  schema = "ref",
  table_prefix = "pop_geo_",
  return_query = FALSE
)

Arguments

kingco

Logical vector of length 1. Identifies whether you want population estimates limited to King County. Only impacts results for geo_type in c('blk', 'blkgrp', 'lgd', 'scd', 'tract', 'zip').

Default == TRUE.

years

Numeric vector. Identifies which year(s) of data should be pulled.

Default == 2022.

ages

Numeric vector. Identifies which age(s) should be pulled.

Default == c(0:100), with 100 being the top coded value for 100:120.
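
The top coding described above can be pictured with a small base R sketch. It only illustrates what "100 stands in for 100:120" means when preparing your own ages to line up with these estimates; top_code_age is a hypothetical helper and is not part of rads.

```r
# Hypothetical helper (not part of rads): collapse ages above 100 into the
# top-coded value 100, mirroring the 100:120 top code described above.
top_code_age <- function(age, top = 100L) {
  stopifnot(is.numeric(age), all(age >= 0, na.rm = TRUE))
  pmin(as.integer(age), top)
}

my_ages <- c(0, 45, 99, 100, 107, 120)
top_code_age(my_ages)
```

Ages up to 99 pass through unchanged, while anything from 100 upward is reported as 100.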

genders

Character vector of length 1 or 2. Identifies which gender(s) should be pulled. The acceptable values are 'f', 'female', 'm', and 'male'.

Default == c('f', 'm').

races

Character vector of length 1 to 7. Identifies which race(s) or ethnicity should be pulled. The acceptable values are "aian", "asian", "black", "hispanic", "multiple", "nhpi", and "white". Note that "hispanic" is only valid when race_type = "race_eth".

Default == all the possible values.

race_type

Character vector of length 1. Identifies whether to pull race data with Hispanic as an ethnicity ("race"), Hispanic as a race ("race_eth"), or each race, including Hispanic, alone and in combination ('race_aic').

Default == c("race_eth").

geo_type

Character vector of length 1. Identifies the geographic level for which you want population estimates. The acceptable values are: 'blk', 'blkgrp', 'county', 'hra', 'kc', 'lgd' (WA State legislative districts), 'region', 'seattle', 'scd' (school districts), 'tract', 'wa', and 'zip'.

Default == "kc".

group_by

Character vector. Identifies how you would like the data 'grouped' (i.e., stratified). Valid options are limited to: "years", "ages", "genders", "race", "race_eth", "race_aic", and/or "hispanic". "hispanic" can only be specified when race_type = 'race_eth', which returns rows for Race-non Hispanic and Race-Hispanic. Both race_eth and hispanic can be included in the group_by argument at the same time. If race_type = 'race' or race_type = 'race_aic', then 'race' or 'race_aic' must be included in group_by, respectively. Results are always grouped by geo_id.

Default == NULL, i.e., estimates are only grouped / aggregated by geography.
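
The group_by rules above can be restated as a short pre-flight check in base R. This is a sketch of the documented rules only; check_group_by is a hypothetical helper, and it is not how rads itself validates its arguments.

```r
# Hypothetical pre-flight check (not part of rads): encodes the group_by rules
# described above so mistakes surface before a query is sent.
check_group_by <- function(group_by = NULL, race_type = "race_eth") {
  valid <- c("years", "ages", "genders", "race", "race_eth", "race_aic", "hispanic")
  bad <- setdiff(group_by, valid)
  if (length(bad) > 0) {
    stop("Invalid group_by value(s): ", paste(bad, collapse = ", "))
  }
  # "hispanic" is only allowed alongside race_type = "race_eth"
  if ("hispanic" %in% group_by && race_type != "race_eth") {
    stop("'hispanic' can only be used in group_by when race_type = 'race_eth'")
  }
  # race_type "race" / "race_aic" must itself appear in group_by
  if (race_type %in% c("race", "race_aic") && !(race_type %in% group_by)) {
    stop("group_by must include '", race_type, "' when race_type = '", race_type, "'")
  }
  invisible(TRUE)
}

check_group_by(c("years", "race_eth", "hispanic"), race_type = "race_eth")  # passes
check_group_by(c("years", "race"), race_type = "race")                      # passes
# check_group_by("hispanic", race_type = "race")  # would stop with an error
```

Leaving group_by as NULL also passes the check when race_type = "race_eth", which matches the default behavior of grouping by geography only.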

round

Logical vector of length 1. Identifies whether or not population estimates should be returned as whole numbers.

Default == FALSE (as of 02/2023).

mykey

Character vector of length 1 OR a database connection (via DBI::dbConnect()). If the former, it should identify the keyring:: key that can be used to access the Health & Human Services Analytic Workspace (HHSAW).

Default == 'hhsaw'

census_vintage

Integer. One of 2010 or 2020. Refers to the most recent Census that informed the set of population estimates.

Default == 2020

geo_vintage

Integer. One of 2010 or 2020. Refers to the Census that influenced the creation of the geographies. See Details for notes.

Default == 2020

schema

Character. Name of the schema in the database where population data is stored. Unless you know what you are doing, do not change the default!

Default == 'ref'

table_prefix

Character. Prefix of the tables in schema where population data is stored. The table will be selected as {schema}.pop_geo_{geo_type} unless geo_type is aggregated on the fly from blocks. Unless you know what you are doing, do not change the default!

Default == 'pop_geo_'

return_query

Logical. Instead of computing the results, return the query that would fetch them.

Default == FALSE
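
As a rough sketch of how return_query might be used, the snippet below first asks for the query and then repeats the same request for the estimates themselves. It assumes rads is installed and that HHSAW credentials are already configured. Whether a live connection is needed just to build the query is not stated here, so the calls are guarded and do nothing in a non-interactive session.

```r
# Arguments for a small pull; return_query = TRUE asks for the query rather
# than the results.
query_args <- list(geo_type = "zip", years = 2020, return_query = TRUE)

# Guarded so the sketch does nothing unless rads is installed and the session
# is interactive (assumed here to mean HHSAW credentials are set up).
if (interactive() && requireNamespace("rads", quietly = TRUE)) {
  q <- do.call(rads::get_population, query_args)
  print(q)  # inspect the query before running it

  # Same request, this time fetching the estimates themselves.
  pop_args <- modifyList(query_args, list(return_query = FALSE))
  pop <- do.call(rads::get_population, pop_args)
  print(pop)
}
```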

Details

Note the following geography limitations:

– 'county', 'lgd', 'scd', and 'zip' apply to all of WA State

– 'hra', 'kc', 'region', and 'seattle' apply to King County only

– 'blk', 'blkgrp', and 'tract' apply to King, Snohomish, and Pierce counties only

Note on geo_vintage: ZIP codes and school districts (scd) are unaffected by geo_vintage. ZIP codes are year specific when possible and school districts are mostly considered fixed.

For all other geographies, the value should represent the vintage/era of the Census that influenced the creation of the geographies.
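
The coverage notes above can be kept at hand as a small lookup. The sketch below simply restates those notes in base R; geo_coverage and coverage_for are hypothetical names and are not part of rads. A geo_type that the notes do not mention (for example 'wa') returns NA.

```r
# Hypothetical lookup (not part of rads) restating the coverage notes above.
geo_coverage <- list(
  "WA State" = c("county", "lgd", "scd", "zip"),
  "King County" = c("hra", "kc", "region", "seattle"),
  "King, Snohomish, and Pierce counties" = c("blk", "blkgrp", "tract")
)

# Return the documented coverage for a single geo_type, or NA if the notes
# above do not mention it.
coverage_for <- function(geo_type) {
  stopifnot(is.character(geo_type), length(geo_type) == 1)
  found <- vapply(geo_coverage, function(x) geo_type %in% x, logical(1))
  if (any(found)) names(geo_coverage)[found] else NA_character_
}

coverage_for("tract")
coverage_for("zip")
```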

Value

A data.table of population estimates for further analysis/tabulation.

References

https://github.com/PHSKC-APDE/rads/wiki/get_population

Examples


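 library(rads)  # get_population() is provided by the rads package
 a <- get_population(geo_type = "region")  # King County regions, default year
 print(a)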


PHSKC-APDE/rads documentation built on April 14, 2025, 10:47 a.m.