cleanParsedItems: Clean parsed item scrapes


Description

Given a dataframe of parsed item scrapes containing the columns 'marketplace', 'date', 'seller_id', 'title', and 'listing_description', remove rows with missing information and duplicated rows, match the remaining rows to the items database and remove non-matches, and extract any available PGP keys. This is analogous to 'cleanParsedUsers()'.
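
The row-filtering steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data are invented, and the match key (here 'marketplace' plus 'title') is a guess, since the documentation does not say which columns are used for matching.

```r
# Hypothetical toy inputs; only the column names come from the documentation.
parsedItems <- data.frame(
  marketplace = c("A", "A", "A", "B"),
  date = c("2019-01-01", "2019-01-01", "2019-01-02", "2019-01-03"),
  seller_id = c("s1", "s1", "s2", NA),
  title = c("item one", "item one", "item two", "item three"),
  listing_description = c("desc 1", "desc 1", "desc 2", "desc 3"),
  stringsAsFactors = FALSE
)
items <- data.frame(
  hash_str = "h1",
  marketplace = "A",
  title = "item one",
  vendor = "s1",
  stringsAsFactors = FALSE
)

# 1. Drop rows with any missing value.
step1 <- parsedItems[complete.cases(parsedItems), ]

# 2. Drop exact duplicate rows.
step2 <- step1[!duplicated(step1), ]

# 3. Keep only rows found in the items table.
#    ASSUMPTION: the match key is marketplace + title.
step3 <- merge(
  step2,
  items[, c("hash_str", "marketplace", "title")],
  by = c("marketplace", "title")
)
```

An inner join via 'merge()' drops non-matching rows and attaches 'hash_str' in one step, which is why it is used here in place of a separate filter.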

Usage

cleanParsedItems(parsedItems, items)

Arguments

parsedItems

name of dataframe that contains parsed item listing scrapes, as described above.

items

name of dataframe with unique item listings, with at least the columns 'hash_str', 'marketplace', 'title', and 'vendor'.

Value

1. 'parsedUsers' dataframe with 'listing_description' column replaced by 'descriptionClean', which has any PGP keys extracted.
2. 'retrievedPGPs' dataframe with columns 'date', 'vendor_hash' and 'PGPclean' (extracted PGP keys).
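
As a rough illustration of how a description might be split into 'descriptionClean' and 'PGPclean', the sketch below pulls a standard ASCII-armored public key block out of free text with a regular expression. The pattern, the sample strings, and the choice to keep only the first key per description are assumptions made for this example; the package may extract keys differently.

```r
# ASSUMPTION: keys appear as standard ASCII-armored public key blocks.
pgpPattern <- paste0(
  "-----BEGIN PGP PUBLIC KEY BLOCK-----",
  "[\\s\\S]*?",
  "-----END PGP PUBLIC KEY BLOCK-----"
)

# Hypothetical listing descriptions: one with a key, one without.
description <- c(
  paste0(
    "Fast shipping. -----BEGIN PGP PUBLIC KEY BLOCK-----\n",
    "abc123\n",
    "-----END PGP PUBLIC KEY BLOCK----- Thanks!"
  ),
  "No key here."
)

# Locate the first key block in each description (-1 means no match).
m <- regexpr(pgpPattern, description, perl = TRUE)

# Extracted key text, NA where no key was found.
PGPclean <- rep(NA_character_, length(description))
PGPclean[m != -1] <- regmatches(description, m)

# Description with the key block removed.
descriptionClean <- trimws(sub(pgpPattern, "", description, perl = TRUE))
```

Descriptions without a key pass through unchanged and get 'NA' in 'PGPclean', so the two vectors stay aligned with the input rows.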


xhtai/heisenbrgr documentation built on June 8, 2019, 9:30 a.m.