Overview

Workshop Goals


Topics

  1. Setting Up an R Analysis Environment
  2. Accessing the Data
  3. Generating Estimates
  4. Creating New Variables
  5. Visualizing Estimates
  6. Producing a Travel Analysis Report

"House Cleaning"

This presentation is intended for re-use


Presentation hotkeys

| Key | Action                                   |
|:----|:-----------------------------------------|
| C   | Show table of contents                   |
| F   | Toggles the display of the footer        |
| A   | Toggles display of current vs all slides |
| S   | Make fonts smaller                       |
| B   | Make fonts larger                        |



Setting Up an R Analysis Environment

Click here for latest installation instructions

Instructions provide explicit download links for

Once installed, open RStudio


Setting Up an R Analysis Environment (cont.)

Make sure you have the "summarizeNHTS" R software package installed

install.packages("devtools")
devtools::install_github("Westat-Transportation/summarizeNHTS")

Now that the necessary software is installed, load the package

library(summarizeNHTS)

OK, we're ready!
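If you re-run the setup on a machine that may already have some of these packages, a guarded install avoids repeating the download. The sketch below is one way to do that; `is_installed` and `setup_summarizeNHTS` are hypothetical helper names introduced here for illustration, not part of summarizeNHTS.

```r
# Hypothetical helper: TRUE if the package can be loaded, without attaching it.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical helper: install only what is missing, then load the package.
setup_summarizeNHTS <- function() {
  if (!is_installed("devtools")) {
    install.packages("devtools")
  }
  if (!is_installed("summarizeNHTS")) {
    devtools::install_github("Westat-Transportation/summarizeNHTS")
  }
  library(summarizeNHTS)
}

# Not Run
# setup_summarizeNHTS()
```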



Accessing the Data

Before we begin, let's make sure the summarizeNHTS package is loaded.

library(summarizeNHTS)

Accessing the Data: Downloading NHTS Data

# Not Run
download_nhts_data("2001")
download_nhts_data("2009")
download_nhts_data("2017")

Accessing the Data: Reading the 2017 NHTS data

nhts_data <- read_data("2017", "C:/NHTS")

Accessing the Data: Summarizing the data object

summary(nhts_data$data)

Accessing the Data: Snapshot of the vehicle data

nhts_data$data$vehicle

Accessing the Data: Subsetting

# By position
nhts_data$data$vehicle[, c(1, 3)]

# By name (single variable)
nhts_data$data$vehicle$ANNMILES

# By name (multiple variables)
nhts_data$data$vehicle[, list(HOUSEID, ANNMILES)]

# By row numbers (first 5 rows)
nhts_data$data$vehicle[1:5, ]

# By condition
nhts_data$data$vehicle[VEHTYPE == "01", ]

# By condition (multiple values)
nhts_data$data$vehicle[VEHTYPE %in% c("01","02"), ]
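To practice this subsetting syntax without downloading NHTS data, here is a minimal sketch on a toy table. It assumes the data.table package is installed and that the vehicle table behaves like a data.table, which the `list(...)` column syntax above suggests. The table name `vehicle` and all of its values are made up for illustration.

```r
library(data.table)

# Toy stand-in for nhts_data$data$vehicle; every value here is invented.
vehicle <- data.table(
  HOUSEID  = c("H1", "H1", "H2", "H3"),
  VEHID    = c("01", "02", "01", "01"),
  ANNMILES = c(12000, 3000, 8000, 15000),
  VEHTYPE  = c("01", "03", "02", "01")
)

by_position <- vehicle[, c(1, 3)]                       # first and third columns
by_name     <- vehicle[, list(HOUSEID, ANNMILES)]       # same columns, by name
first_rows  <- vehicle[1:2, ]                           # first two rows
one_type    <- vehicle[VEHTYPE == "01", ]               # rows matching one code
two_types   <- vehicle[VEHTYPE %in% c("01", "02"), ]    # rows matching either code

print(two_types)
```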

Accessing the Data: Codebook objects

# 2017 variables table
head(codebook_2017$variables)

# 2017 values table
head(codebook_2017$values)


Generating Estimates


Generating Estimates: Introduction to summarize_data

summarize_data(
  data = nhts_data,
  agg = "household_count"
)

Generating Estimates: Exploring summarize_data Parameters

summarize_data(
  data = nhts_data,
  agg = "household_count"
)

Generating Estimates: Grouping by Variables

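# Household count by IS_METRO (a derived variable; see Creating New Variables)
summarize_data(
  data = nhts_data,
  agg = "household_count",
  by = "IS_METRO"
)

# Household count by IS_METRO and HOMEOWN (home ownership)
summarize_data(
  data = nhts_data,
  agg = "household_count",
  by = c("IS_METRO","HOMEOWN")
)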

Generating Estimates: Frequencies/Proportions

# Person count
summarize_data(
  data = nhts_data,
  agg = "person_count"
)

# Proportion of persons by WORKER (worker status)
summarize_data(
  data = nhts_data,
  agg = "person_count",
  by = "WORKER",
  prop = TRUE
)
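To see what a weighted proportion means in principle, here is a minimal base R sketch on toy data. It assumes each person record carries a survey weight and that a group's proportion is its share of the total weight. The column name `weight` and all values are hypothetical, and the package's own computation may differ in detail.

```r
# Toy person records; the weights and codes are invented for illustration.
persons <- data.frame(
  WORKER = c("01", "01", "02", "02", "01"),
  weight = c(100, 150, 200, 50, 250),
  stringsAsFactors = FALSE
)

# Weighted count per group: the sum of weights within each WORKER code.
weighted_count <- sapply(split(persons$weight, persons$WORKER), sum)

# Weighted proportion: each group's share of the total weight.
weighted_prop <- weighted_count / sum(weighted_count)

print(weighted_count)
print(round(weighted_prop, 3))
```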

Generating Estimates: Numeric Aggregates

# Average TRPMILES, trip distance in miles
summarize_data(
  data = nhts_data,
  agg = "avg",
  agg_var = "TRPMILES"
)
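As a rough illustration of a weighted average, the sketch below compares an unweighted and a weighted mean on toy trip records. It assumes each trip carries a survey weight; the column name `weight` and the values are hypothetical, and this is not the package's internal code.

```r
# Toy trip records; distances and weights are invented for illustration.
trips <- data.frame(
  TRPMILES = c(2, 10, 4),
  weight   = c(300, 100, 100)
)

# Unweighted mean treats every sampled trip equally.
unweighted_avg <- mean(trips$TRPMILES)

# Weighted mean lets each trip count in proportion to its weight.
weighted_avg <- weighted.mean(trips$TRPMILES, trips$weight)

print(c(unweighted = unweighted_avg, weighted = weighted_avg))
```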

Notes


Generating Estimates: Trip Rates

# Daily person trips by worker status
summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = "WORKER"
)
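Conceptually, a person trip rate is the average number of trips per person, counting people who made no trips. The sketch below works this out on toy data under a simplifying assumption: each person has one weight, and the rate is the weighted mean of trips per person. The names `weight` and `n_trips` are hypothetical, and the package's own calculation with NHTS weights may differ.

```r
# Toy person and trip tables; identifiers and weights are invented.
persons <- data.frame(
  PERSONID = c("P1", "P2", "P3"),
  weight   = c(100, 200, 100),
  stringsAsFactors = FALSE
)
trips <- data.frame(
  PERSONID = c("P1", "P1", "P1", "P2", "P2"),
  stringsAsFactors = FALSE
)

# Count trips per person; using the person IDs as factor levels keeps
# people with zero trips (P3) in the result.
trip_counts <- table(factor(trips$PERSONID, levels = persons$PERSONID))
persons$n_trips <- as.vector(trip_counts)

# Weighted average number of trips per person.
trip_rate <- weighted.mean(persons$n_trips, persons$weight)
print(trip_rate)
```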

Generating Estimates: Subsetting in summarize_data

# Distribution of social/recreational trips by travel day
summarize_data(
  data = nhts_data,
  agg = "trip_count",
  by = "TRAVDAY",
  prop = TRUE,
  subset = "WHYTRP90 %in% c('07','08','10')"
)

# Person trip rate by sex (for millennials, ages 18 to 34)
summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = "R_SEX",
  subset = "R_AGE >= 18 & R_AGE <= 34"
)
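The `subset` argument is passed as a string containing an R logical expression. To check such an expression before using it, one option is to evaluate it against a small data frame in base R, as sketched below. How summarize_data evaluates the string internally is not shown here, so treat this only as a way to test the logic; the toy data are invented.

```r
# Toy person records; values are invented for illustration.
persons <- data.frame(
  R_AGE = c(16, 25, 34, 50),
  R_SEX = c("01", "02", "01", "02"),
  stringsAsFactors = FALSE
)

# The same expression string used in the subset argument above.
subset_expr <- "R_AGE >= 18 & R_AGE <= 34"

# Parse the string and evaluate it with the data frame columns in scope.
keep <- eval(parse(text = subset_expr), envir = persons)

millennials <- persons[keep, ]
print(millennials)
```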

Generating Estimates: Documentation

?summarize_data

R Documentation for summarize_data



Creating New Variables


Creating New Variables: Example Scenario

Example Derived Variable Coding Scenario

1) Someone's interested in querying the NHTS for a particular travel behavior

Anthony: "I am interested in exploring how financial burden may affect travel."
Alex: "Remember that question about walking to save money? I would include that in your analysis."

2) Consider the suggested variable's usefulness for Anthony's analysis:

WALK2SAVE: "I walk to places to save money."

Values:

| Value | Label                     |
|:------|:--------------------------|
| 01    | Strongly agree            |
| 02    | Agree                     |
| 03    | Neither Agree or Disagree |
| 04    | Disagree                  |
| 05    | Strongly disagree         |

3) Look for other potential ways of manipulating this variable for analysis

4) Create variable called WALK_FINANCE, a yes/no variable for the binary analysis question, "who does or does not walk to save money?"
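The recoding in step 4 can be sketched in base R as follows. The cut-off is an assumption made here for illustration: "Strongly agree" and "Agree" (codes 01 and 02) are treated as "Yes", and every other code as "No". The toy responses are invented.

```r
# Toy WALK2SAVE responses; values are invented for illustration.
WALK2SAVE <- c("01", "02", "03", "04", "05", "02")

# Assumed coding: codes 01 and 02 (agree) become "1"; all others become "2".
WALK_FINANCE <- ifelse(WALK2SAVE %in% c("01", "02"), "1", "2")

# Labels for the new codes.
walk_finance_labels <- c("1" = "Yes", "2" = "No")

print(table(walk_finance_labels[WALK_FINANCE]))
```

In a real analysis you would also decide how to treat missing or skipped responses, which this sketch simply folds into "No".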


Creating New Variables: Configuration


Creating New Variables: Configuration (cont.)

Derived variable file requirements

| Item   | Description                                               |
|:-------|:----------------------------------------------------------|
| NAME   | The name of the variable as it will appear in the dataset |
| TABLE  | The table level this variable is being computed for       |
| TYPE   | Data type (numeric or character)                          |
| DOMAIN | Logical expression that decides value assignment          |
| VALUE  | A variable code value                                     |
| LABEL  | Description of code value                                 |

Review File


Creating New Variables: Example 1 (Has/Has-not)

Using the Derived Variables file, create a variable with the following requirements:


| NAME        | TABLE     | TYPE      | DOMAIN        | VALUE | LABEL |
|:------------|:----------|:----------|:--------------|:------|:------|
| HAS_VEHICLE | household | character | HHVEHCNT > 0  | 1     | Yes   |
| HAS_VEHICLE | household | character | HHVEHCNT == 0 | 2     | No    |


Creating New Variables: Example 2 (Grouping)

Using the Derived Variables file, create a variable with the following requirements:


| NAME      | TABLE  | TYPE      | DOMAIN                    | VALUE | LABEL        |
|:----------|:-------|:----------|:--------------------------|:------|:-------------|
| AGE_GROUP | person | character | R_AGE >= 0 & R_AGE <= 17  | 1     | Child        |
| AGE_GROUP | person | character | R_AGE >= 18 & R_AGE <= 44 | 2     | Young Adult  |
| AGE_GROUP | person | character | R_AGE >= 45 & R_AGE <= 65 | 3     | Middle Adult |
| AGE_GROUP | person | character | R_AGE >= 66               | 4     | Older Adult  |
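To see how DOMAIN, VALUE, and LABEL work together, the sketch below applies the AGE_GROUP rules to a toy person table in base R: each DOMAIN expression is evaluated against the data, and matching rows receive that rule's VALUE. This is only an illustration of the idea, not the package's implementation; the object names `rules` and `person` and the ages are made up.

```r
# The AGE_GROUP rules from the table above, held as a small data frame.
rules <- data.frame(
  DOMAIN = c("R_AGE >= 0 & R_AGE <= 17",
             "R_AGE >= 18 & R_AGE <= 44",
             "R_AGE >= 45 & R_AGE <= 65",
             "R_AGE >= 66"),
  VALUE  = c("1", "2", "3", "4"),
  LABEL  = c("Child", "Young Adult", "Middle Adult", "Older Adult"),
  stringsAsFactors = FALSE
)

# Toy person table; ages chosen to hit each boundary.
person <- data.frame(R_AGE = c(5, 17, 18, 44, 45, 65, 66, 90))

# Rows that match no DOMAIN keep the initial NA.
person$AGE_GROUP <- NA_character_
for (i in seq_len(nrow(rules))) {
  hit <- eval(parse(text = rules$DOMAIN[i]), envir = person)
  person$AGE_GROUP[hit] <- rules$VALUE[i]
}

# Attach the human-readable label for each assigned code.
person$AGE_GROUP_LABEL <- rules$LABEL[match(person$AGE_GROUP, rules$VALUE)]
print(person)
```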


Creating New Variables: Example 3 (Uses/Does-not-use)

Using the Derived Variables file, create a variable with the following requirements:


| NAME     | TABLE  | TYPE      | DOMAIN         | VALUE | LABEL |
|:---------|:-------|:----------|:---------------|:------|:------|
| USES_TNC | person | character | RIDESHARE > 0  | 1     | Yes   |
| USES_TNC | person | character | RIDESHARE == 0 | 2     | No    |


Creating New Variables: Example 4 (Is/Is-not)

Using the Derived Variables file, create a variable with the following requirements:


| NAME     | TABLE     | TYPE      | DOMAIN                        | VALUE | LABEL |
|:---------|:----------|:----------|:------------------------------|:------|:------|
| IS_METRO | household | character | MSACAT %in% c('01','02','03') | 1     | Yes   |
| IS_METRO | household | character | MSACAT %in% c('04')           | 2     | No    |


Creating New Variables: Summary



Visualizing Estimates


Visualizing Estimates: Tables (Introductory)

statistic <- summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = "WORKER"
)

make_table(statistic)

Visualizing Estimates: Tables (Advanced)

statistic <- summarize_data(
  data = nhts_data,
  agg = "person_count",
  by = c("TRAVDAY","OCCAT","EDUC"),
  exclude_missing = TRUE
)

make_table(
  tbl = statistic,
  title = "Table 1: Distribution of Persons (%) by Travel Day, Job Category, and Educational Attainment",
  output = c(W = "Weighted Percentage", N = "Sample Size"),
  row_vars = c("EDUC","OCCAT")
)

Visualizing Estimates: Charts (Introductory)

statistic <- summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = "WORKER",
  exclude_missing = TRUE
)

make_chart(statistic)

Visualizing Estimates: Charts (Advanced)

Person Trip Rate by Sex, Worker Status, and Travel Day of Week

statistic <- summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = c("R_SEX","WORKER","TRAVDAY"),
  exclude_missing = TRUE
)

# Specify fill and facet
make_chart(
  tbl = statistic, 
  fill = "WORKER",
  facet = "TRAVDAY",
  palette = "Accent"
)

Visualizing Estimates: Maps (Introductory)

statistic <- summarize_data(
  data = nhts_data,
  agg = "person_count",
  by = "CENSUS_D"
)

make_map(statistic)

Visualizing Estimates: Maps - Built in Geography Layers


Visualizing Estimates: Maps (Advanced)

Include a second table grouping by the original geography plus one variable.

statistic1 <- summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = "HHSTFIPS",
  exclude_missing = TRUE
)

statistic2 <- summarize_data(
  data = nhts_data,
  agg = "person_trip_rate",
  by = c("HHSTFIPS","WORKER"),
  exclude_missing = TRUE
)

map <- make_map(
  tbl = statistic1, 
  tbl2 = statistic2
)
map

Formatting

statistic <- summarize_data(
  data = nhts_data,
  agg = "trip_count",
  by = "PRMACT",
  exclude_missing = TRUE
)

make_table(
  tbl = statistic,
  title = "Trip Count by Primary Activity (in Millions)",
  output = c(W = "Trip Count (Millions)", E = "SE"),
  digits = 0,
  multiplier = 1000000
)


Producing a Travel Analysis Report



Westat-Transportation/summarizeNHTS documentation built on May 17, 2020, 8:57 p.m.