knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(processFCC)

Loading the package

Install and load the package with the following code:

install.packages("devtools")
library(devtools)
install_github("kdmulligan/processFCC")
library(processFCC)

Explanation of FCC Form History

The Federal Communications Commission (FCC) has collected data from all facilities-based broadband providers twice a year using Form 477. This form includes information about where and what type of Internet access is provided at the census block level. However, this form was discontinued in in December 2021.

Instead, broadband providers now submit information about their broadband internet access services in the broadband data collection system. Within this system, data is reported more granularity than the census block level. For more information about the filing system, please reference the form resources website.

Because of these changes in reporting, one of the first things you will need to do is decide which time points of the FCC fixed broadband data you wish to use and process. Dates from December 2021 and prior use the “old” format whereas dates June 2022 and beyond use the “new” format. For both formats, fixed broadband data is available at two time points per year: June and December. Use new_or_old_FCC() to check which dates are available and also the format (new or old) that should be used to process the data. The output of the function provides the available dates, the format, the function to use to process the data, and the website if you would like more information.

New FCC data format

To process a data time point in the “new” format you will need functions: avail_new_dates() and rollup_new_FCC(). The most efficient way to work with the new FCC data is the public data application programming interface (API). Using the API functionality avoids the laborious task of downloading each data file individually from the website.

To use the FCC API and the functions in this package for the “new” data format, it is necessary need to create an FCC User Registration account and then setup an API key. To create an account, go to https://broadbandmap.fcc.gov/login and then click “Create an account” in the bottom right of the sign-in box.

Once logged into your account, click on your username in the top right corner. Then select the “Manage API Access” option. From this new page, click the “Generate” button to generate a new API token. Copy the token value and save it in a separate location. You will need this key and your username in avail_new_dates() and rollup_new_FCC() to access the database. For more information on creating an API key visit the FCC API Instructions.

Once your account and API key is set up. The function avail_new_dates() can be used to see available dates in the new FCC data format.

avail_new_dates(
  fcc_username = "email@location.com",
  api_key = "longstringofapicharacters"
)

If you already know what date you would like to process you can move right to using rollup_new_FCC(). The rollup functions count the number of unique broadband providers providing internet at or above the speed threshold combinations, up to 5 speed thresholds combinations can be specified. It is also possible to exclude specific broadband technologies or process only some states.

The arguments of rollup_new_FCC() are as follows:

In the following call to rollup_new_FCC(), the June 2022 FCC data is rolled up to the census block level for all states (states = NULL) excluding technology codes 60, and 70. For each census block, the processed data considers 5 threshold combinations for download/upload speeds: 25/3, 25/5, 50/5, 75/10, and 100/100 Mbps. The processed data will count the number of providers within the census block providing broadband at the threshold speeds and excluding the specified technologies. No CSV file is saved with the final dataset.

rollup_new_FCC(
  fcc_username = "email@location.com",
  api_key = "longstringofapicharacters",
  get_year = "2022",
  get_month = "Jun",
  states = NULL,
  geogr = "cb",
  tech_exc = c("60", "70"),
  thresh_down = c(25, 25, 50, 100, 100),
  thresh_up = c(3, 5, 10, 10, 100),
  save_csv = FALSE,
  wd = getwd()
)

Old FCC data format

If you wish to use a time point of FCC data from the "old" format you will need to use the following functions: old_FCC_links(), csv_to_sql_db(), and rollup_old_FCC().

old_FCC_links() returns URL(s)for the download website for the unprocessed data set of interest. If year and month are left NULL then the output data set will contain all available years and months for the old format. Set most_recent to TRUE in order to only get the link for the most recent version of the dataset(s), otherwise all available versions will be output. For example, June 2018 is one time point of the data and there may be multiple versions of one time point as the data is updated.

The following code results in a data set with the URLs for where to download the most recent versions of the June and December 2020 FCC data.

old_FCC_links(year = 2020, month = NULL, most_recent = TRUE)

Based on the FCC data time point you would like to process, copy the link from the old_FCC_links() output, go to the website, and download the US - Fixed with Satellite data set under the "Fixed Broadband Deployment Block Data" header. The code should still work if you decide to work with a state-level data set but the code is originally designed to work with the U.S. data set. Once on the website from old_FCC_links(), the link for the csv file may take you to dropbox. In this case you will need to click on the 3 dots icon on the top bar of the page and then click download.

Next, the FCC CSV file should be added to a SQLite database. To do this, first, create the database connection with dbConnect(). Here, we name the database fcc.sqlite. You can use dbListTables() to check if any tables already exist in the database if you have used the connection previously. Next, in the call to csv_to_sql_db(), we add the FCC data for June 2020 in Alaska to the SQLite database. The name of the table being created in the SQLite database is fcc_ak_2020_Jun, the default table name would not include the Alaska designation. In csv_to_sql_db(), you must set the arguments csv_file and con.

fcc_con <- dbConnect(SQLite(), dbname = "fcc.sqlite")
dbListTables(fcc_con)

csv_to_sql_db(csv_file = "C:/Users/kaile/Downloads/AK-Fixed-Jun2020-v2.csv", 
              con = fcc_con, 
              new_tbl_name = "fcc_ak_2020_Jun",
              year = 2020, month = "Jun",
              pre_process_size = 1000, chunk_size = 50000, 
              show_progress_bar = TRUE)

Once the data is in the SQLite database, we can then process it using rollup_old_FCC(). The output of this function is a data set based on the specifications from the function arguments similar to rollup_new_FCC(). The following call to rollup_old_FCC() processes the June 2020 Alaska FCC data that was just loaded to the SQLite database. The processed data is rolled up to the Census Tract level excluding technology codes 0, 60, and 70 and looks at 5 threshold combinations for download/upload speeds: 25/3, 25/5, 50/5, 75/10, and 100/100 Mbps. The processed data will count the number of providers within the census tract providing broadband at the thresholds. No CSV file is saved with the final dataset.

In the rollup_old_FCC() function, the arguments con and table_in_con are the only two differences from rollup_new_FCC(): con is a SQLite database connection and table_in_con is the table in the con database to process.

processed_dat <- rollup_old_FCC(
    con = fcc_con,
    table_in_con = "fcc_ak_2020_Jun",
    year = 2020, month = "Jun",
    geogr = "ct",
    tech_exc = c("0", "60", "70"),
    thresh_down = c(25, 25, 50, 75, 100),
    thresh_up = c(3, 5, 5, 10, 100)
  )
head(processed_dat)

If you are done working with the SQLite database, be sure to disconnect from the database with dbDisconnect(). You may also want to remove the sqlite, ZIP, and raw CSV files from your working directory because they are considerably large and take up a lot of space.

dbDisconnect(con)
file.remove("fcc.sqlite")


kdmulligan/processFCC documentation built on Oct. 30, 2024, 7:43 p.m.