strr_process_daily: Function to process raw daily STR tables into UPGo format

View source: R/strr_process_daily.R

strr_process_daily    R Documentation

Function to process raw daily STR tables into UPGo format

Description

strr_process_daily takes raw daily tables from AirDNA and cleans them to prepare for compression into the UPGo database storage format (using strr_compress).

Usage

strr_process_daily(daily, property, keep_cols = FALSE, quiet = FALSE)

Arguments

daily

An unprocessed daily table in the raw AirDNA format, with either ten or six fields.

property

A property table processed in the UPGo style.

keep_cols

A logical scalar. If the 'daily' table has 10 fields, should the superfluous 4 fields be kept, or should the table be trimmed to the 6 fields which UPGo uses (default)?

quiet

A logical scalar. Should the function execute quietly, or should it print status updates as it runs (default)?

Details

The function cleans raw daily activity tables from AirDNA and prepares them for compression into the UPGo format. It also produces error tables which identify possibly corrupt or missing lines in the input table.

The function expects the input daily file to have either ten fields (the default for a raw table from AirDNA) or six fields (the default for UPGo, after the "Price (Native)", "Currency Native", "Airbnb Property ID", and "HomeAway Property ID" fields are removed on import).
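The relationship between the two accepted layouts can be sketched in base R. This is only an illustration, not the package's own code: the four dropped field names come from the paragraph above, while the other six column names and all sample values are assumptions invented for the example.

```r
# Hypothetical ten-field raw table. Only the four names in `drop_cols`
# are taken from the documentation; the other six names are assumed.
raw_daily <- data.frame(
  "Property ID"          = c("ab-100", "ab-100"),
  "Date"                 = as.Date(c("2020-01-01", "2020-01-02")),
  "Status"               = c("A", "R"),
  "Booked Date"          = as.Date(c(NA, "2019-12-20")),
  "Price (USD)"          = c(120, 120),
  "Price (Native)"       = c(160, 160),
  "Currency Native"      = c("CAD", "CAD"),
  "Reservation ID"       = c(NA, 555L),
  "Airbnb Property ID"   = c(100L, 100L),
  "HomeAway Property ID" = c(NA_character_, NA_character_),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The four fields the documentation describes as removed on import.
drop_cols <- c("Price (Native)", "Currency Native",
               "Airbnb Property ID", "HomeAway Property ID")

# Mimic keep_cols = FALSE: keep only the columns not listed in drop_cols.
trimmed_daily <- raw_daily[, !(names(raw_daily) %in% drop_cols), drop = FALSE]

ncol(raw_daily)      # 10
ncol(trimmed_daily)  # 6
```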

The function expects the input property file to be formatted in the UPGo style, in particular with fields named "property_ID", "created", and "scraped". A future version may add arguments that allow these field names to be overridden.
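As the Value section notes, the "created" and "scraped" fields delimit each listing's active period. A minimal base R sketch of such a split follows. It assumes the daily table has "property_ID" and "date" columns and that a row is active when its date lies between "created" and "scraped" inclusive; both the daily column names and the inclusive comparison are assumptions, not confirmed package behaviour.

```r
# Hypothetical property table with the three fields named in the text.
property <- data.frame(
  property_ID = c("ab-100", "ab-200"),
  created     = as.Date(c("2020-01-02", "2020-01-01")),
  scraped     = as.Date(c("2020-01-03", "2020-01-02")),
  stringsAsFactors = FALSE
)

# Hypothetical daily table; the column names are assumed for illustration.
daily <- data.frame(
  property_ID = c("ab-100", "ab-100", "ab-100", "ab-200", "ab-200", "ab-200"),
  date        = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                          "2020-01-01", "2020-01-02", "2020-01-03")),
  stringsAsFactors = FALSE
)

# Attach each listing's created/scraped dates to its daily rows.
joined <- merge(daily, property, by = "property_ID")

# A row counts as active when its date lies within [created, scraped].
is_active <- joined$date >= joined$created & joined$date <= joined$scraped

daily_active   <- joined[is_active,  c("property_ID", "date")]
daily_inactive <- joined[!is_active, c("property_ID", "date")]
```

With this sample data, four rows fall inside an active period and two fall outside it.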

Because the input files are expected to be very large, the function converts the daily input table to a data.table by reference before processing. This saves a considerable amount of memory by avoiding an unnecessary copy of the table, but it has the side effect that the object originally passed as 'daily' is itself changed to a data.table.
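The side effect described above can be reproduced with data.table directly. The sketch below uses data.table::setDT, which converts a data.frame to a data.table by reference. Whether strr_process_daily calls setDT specifically is an assumption, and the sample table is invented.

```r
library(data.table)

# Hypothetical daily table stored as a plain data.frame.
daily <- data.frame(
  property_ID = c("ab-100", "ab-100"),
  date        = as.Date(c("2020-01-01", "2020-01-02")),
  status      = c("A", "R")
)

# as.data.table() returns a converted copy and leaves the input untouched.
daily_copy <- as.data.table(daily)
class_after_copy <- class(daily)    # "data.frame"

# setDT() converts in place: no copy is made, but `daily` itself changes class.
setDT(daily)
class_after_setdt <- class(daily)   # "data.table" "data.frame"

# setDF() reverses the conversion, also by reference, if a plain
# data.frame is needed again afterwards.
setDF(daily)
class_after_setdf <- class(daily)   # "data.frame"
```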

Value

A list with four elements:

1) the processed daily table, ready for compression with strr_compress;
2) a processed daily_inactive table, containing the rows which fall outside a listing's active period (as determined by the 'created' and 'scraped' fields in the property table), ready for compression with strr_compress;
3) an error table identifying corrupt or otherwise invalid row entries;
4) a missing_rows table identifying property_IDs with missing dates in between their first and last date entries, and therefore potentially missing data.
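The idea behind the missing_rows table can be illustrated with a small base R sketch: for each property_ID, compare the number of distinct dates present with the number of days spanned by its first and last entries. The column names "property_ID" and "date", the sample data, and the layout of the result are assumptions for illustration; the table the package actually returns may be structured differently.

```r
# Hypothetical daily table: ab-200 has no rows for 2020-01-02 and 2020-01-03.
daily <- data.frame(
  property_ID = c("ab-100", "ab-100", "ab-100", "ab-200", "ab-200"),
  date        = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                          "2020-01-01", "2020-01-04")),
  stringsAsFactors = FALSE
)

# Summarise each listing: days actually present vs. days expected between
# its first and last date entries (inclusive).
per_listing <- lapply(split(daily$date, daily$property_ID), function(d) {
  span_days <- as.integer(difftime(max(d), min(d), units = "days"))
  data.frame(
    first_date    = min(d),
    last_date     = max(d),
    days_present  = length(unique(d)),
    days_expected = span_days + 1L
  )
})
summary_tbl <- do.call(rbind, per_listing)
summary_tbl$property_ID <- names(per_listing)

# Listings with fewer dates than expected have gaps, hence possibly missing data.
missing_rows <- summary_tbl[summary_tbl$days_present < summary_tbl$days_expected, ]
missing_rows$property_ID
```

Here only "ab-200" is flagged, because it has two dates recorded across a four-day span.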


UPGo-McGill/strr documentation built on Feb. 24, 2024, 6:15 p.m.