read_many_shop: Read and row-bind many files from the SafeGraph Shop
In SafeGraphInc/SafeGraphR: Package for Processing and Analyzing SafeGraph Data

read_many_shop

R Documentation

Read and row-bind many files from the SafeGraph Shop

Description

This accepts a directory. It will use read_shop to load every zip in that folder, assuming they are all files downloaded from the SafeGraph shop. It will then row-bind together each of the subfiles, so you'll get a list where one entry is all the normalization data row-bound together, another is all the patterns files, and so on. Note that after reading in data, if gen_fips = TRUE, state and county names can be merged in using data(fips_to_names).
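The row-binding step described above can be pictured with a small base R sketch. This is only a conceptual illustration, not the package's internal code: the `march` and `april` lists are hypothetical stand-ins for what reading two shop zips might return, and the `devices` column is invented for the example.

```r
# Hypothetical stand-ins for the result of reading two shop ZIP files:
# each is a named list with one table per sub-file.
march <- list(
  patterns = data.frame(region = c("CA", "WA"), raw_visit_counts = c(10, 5)),
  home_panel_summary.csv = data.frame(region = "CA", devices = 100)
)
april <- list(
  patterns = data.frame(region = c("CA", "WA"), raw_visit_counts = c(12, 7)),
  home_panel_summary.csv = data.frame(region = "CA", devices = 110)
)
per_zip <- list(march, april)

# For each sub-file name, pull that table out of every ZIP and row-bind them,
# giving one combined table per sub-file.
subfiles <- names(per_zip[[1]])
combined <- lapply(subfiles, function(nm) {
  do.call(rbind, lapply(per_zip, function(z) z[[nm]]))
})
names(combined) <- subfiles
```

Here `combined` is a list with one entry per sub-file, which mirrors the shape of the result the description refers to.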

Usage

read_many_shop(
  dir = ".",
  recursive = FALSE,
  filelist = NULL,
  start_date = NULL,
  keeplist = c("patterns", "normalization_stats.csv", "home_panel_summary.csv",
    "visit_panel_summary.csv", "brand_info.csv"),
  exdir = dir,
  cleanup = TRUE,
  by = NULL,
  fun = sum,
  na.rm = TRUE,
  filter = NULL,
  expand_int = NULL,
  expand_cat = NULL,
  expand_name = NULL,
  multi = NULL,
  naics_link = NULL,
  select = NULL,
  gen_fips = FALSE,
  silent = FALSE,
  ...
)

Arguments

`dir`	Name of the directory the files are in.
`recursive`	Look for files in all subdirectories as well.
`filelist`	Optionally specify only a subset of the filenames to read in.
`start_date`	A vector of dates, as date objects, giving the first date present in each zip file, to be passed to `read_patterns`. When using `read_many_shop` this really should be included, since the patterns file names in the shop files are not in a format `read_patterns` can pick up on automatically. If left unspecified, the function will produce an error. To truly go ahead without specifying it, set this to `FALSE`.
`keeplist, exdir, cleanup`	Arguments to be passed to `read_shop`, specified as in `help(read_shop)`.
`by, fun, na.rm, filter, expand_int, expand_cat, expand_name, multi, naics_link, select, gen_fips, silent, ...`	Other arguments to be passed to `read_patterns`, specified as in `help(read_patterns)`.
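Since `start_date` needs one date per zip file, it may help to build the vector alongside the file names and check that the lengths agree. The following is a minimal sketch in base R; the file names are hypothetical, and it assumes the dates should be listed in the same order in which the zips are read.

```r
# Hypothetical file names; real shop ZIP names will differ.
zips <- c("shop_march.zip", "shop_april.zip")

# One start date per ZIP, in the same order as the files (an assumption).
start_dates <- as.Date(c("2020-03-01", "2020-04-01"))
stopifnot(length(start_dates) == length(zips))

# For consecutive monthly files, seq() can generate the same vector.
monthly <- seq(as.Date("2020-03-01"), by = "month", length.out = length(zips))
```

Either `start_dates` or `monthly` could then be supplied as the `start_date` argument.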

Examples


## Not run: 
# In the working directory we have two shop ZIP files, one for March and one for April.
mydata <- read_many_shop(
  # I only want some of the sub-files
  keeplist = c('patterns', 'home_panel_summary.csv'),
  # For patterns, only keep these variables
  select = c('raw_visit_counts', 'region', 'bucketed_dwell_times', 'location_name'),
  # I want two aggregations of patterns - one of total visits by state ('region')
  # and another by location_name that has the dwell times for each brand
  multi = list(
    list(name = 'all',
         by = 'region'),
    list(name = 'location_dwells',
         by = 'location_name',
         expand_cat = 'bucketed_dwell_times',
         expand_name = 'bucketed_times')
  ),
  # Be sure to specify start_date for read_many_shop, one date per ZIP file
  start_date = c(lubridate::ymd('2020-03-01'), lubridate::ymd('2020-04-01')))

# The result is a list with two items - patterns and home_panel_summary.csv
# patterns itself is a list with two data.tables inside - 'all' and 'location_dwells',
# aggregated as given.


## End(Not run)
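The description notes that with gen_fips = TRUE, state and county names can be merged in from data(fips_to_names). A rough sketch of that kind of merge in base R follows. Both tables here are hypothetical stand-ins: the column names `state_fips`, `county_fips`, `statename`, and `countyname` are assumptions for illustration and should be checked against the actual `fips_to_names` data before use.

```r
# Hypothetical aggregated output carrying FIPS codes,
# as might result from reading with gen_fips = TRUE.
visits <- data.frame(
  state_fips = c(6, 53),
  county_fips = c(37, 33),
  raw_visit_counts = c(120, 80)
)

# Hypothetical stand-in for the lookup table; the real fips_to_names
# may use different column names or types.
lookup <- data.frame(
  state_fips = c(6, 53),
  county_fips = c(37, 33),
  statename = c("California", "Washington"),
  countyname = c("Los Angeles County", "King County")
)

# Left join: keep every row of visits and attach names where the codes match.
named <- merge(visits, lookup, by = c("state_fips", "county_fips"), all.x = TRUE)
```

With the real package data, `lookup` would be replaced by the table loaded via data(fips_to_names).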

SafeGraphInc/SafeGraphR documentation built on Nov. 25, 2022, 11:20 a.m.