library(tidyverse)
library(edr)
library(readxl)

5.2. Using readr to Import Data from CSV Files

Listing 5.1. Using the dataset_to_csv() function (exclusive to the edr package and its datasets) to write us_cities.csv to the project directory.

# dataset_to_csv(dataset = "us_cities")

Listing 5.3. Using the read_csv() function to read the us_cities.csv to a tibble object.

city_table <- read_csv(file = "us_cities.csv")

city_table

Listing 5.4. Using the read_csv() with us_cities.csv, specifying the exact column types in col_types.

city_table <- read_csv(file = "us_cities.csv", col_types = "cccii")

city_table

5.3 Using readxl to import Excel data

Listing 5.5. Using the create_excel_file() function to write us_cities.xlsx to the project directory.

# create_excel_file()

Listing 5.6. Using the read_excel() function to read the first sheet from the us_cities.xlsx file to a tibble object.

city_table_x <- read_excel(path = "us_cities.xlsx")

city_table_x

Listing 5.7. Using the read_excel() function to read in the second sheet (us_cities_messy) from the us_cities.xlsx file. The result is not ideal.

city_table_x_m <- read_excel("us_cities.xlsx", sheet = 2)

city_table_x_m

Listing 5.8. Using the read_excel() function with the us_cities_messy sheet, take 2. With some options (and some care!) we get a usable tibble.

city_table_x_m <- 
  read_excel(
    "us_cities.xlsx",
    sheet = 2,
    skip = 2,
    col_types = c(rep("text", 3), rep("numeric", 2), rep("skip", 2)),
    na = "N/A")

city_table_x_m

Listing 5.9. The column names (after importing from Excel) can be easily changed using dplyr’s rename() function.

city_table_x_m <- 
  city_table_x_m %>%
  rename(
    state_id = `state id`,
    state_name = `state name`,
    pop_urb = `urban population`,
    pop_mun = `municipal population`
  )

city_table_x_m


rich-iannone/rwr documentation built on Jan. 22, 2021, 7:51 p.m.