cleanParsedUsers: Clean parsed profile scrapes

Description

Given a data frame of parsed profile scrapes with columns 'marketplace', 'date', 'id' and 'profile', remove rows with missing information and duplicated rows, match the remaining rows against the user database and drop those without a match, and extract any available PGP keys.
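The row-cleaning steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual implementation: the helper name sketchCleanRows and the toy data are hypothetical, "missing" is taken to mean NA, and matching is assumed to be on 'marketplace' and 'id'.

```r
# Hypothetical sketch of the row-cleaning steps; not the package's own code.
sketchCleanRows <- function(parsedUsers, users) {
  # Assumption: "missing information" means NA in any documented column
  keyCols <- c("marketplace", "date", "id", "profile")
  out <- parsedUsers[stats::complete.cases(parsedUsers[, keyCols]), ]
  # Drop exact duplicate rows
  out <- out[!duplicated(out), ]
  # Assumption: users are matched on marketplace + id.
  # merge() is an inner join by default, so non-matches are dropped.
  merge(out, users[, c("hash_str", "marketplace", "id")],
        by = c("marketplace", "id"))
}

# Toy inputs, invented for illustration only
parsedUsers <- data.frame(
  marketplace = c("A", "A", "A", "B"),
  date = c("2019-01-01", "2019-01-01", "2019-01-02", "2019-01-01"),
  id = c("u1", "u1", "u2", "u3"),
  profile = c("hello", "hello", NA, "bye"),
  stringsAsFactors = FALSE
)
users <- data.frame(
  hash_str = "h1", marketplace = "A", id = "u1", diversity = 1,
  stringsAsFactors = FALSE
)
cleaned <- sketchCleanRows(parsedUsers, users)
```

In this toy example, one row is dropped for a missing profile, one as a duplicate, and one because it has no match in users.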

Usage

cleanParsedUsers(parsedUsers, users)

Arguments

parsedUsers

Data frame containing the parsed profile scrapes, with the columns described above.

users

Data frame of unique user accounts, with columns 'hash_str', 'marketplace', 'id' and 'diversity'.

Value

1. 'parsedUsers' data frame with the 'profile' column replaced by 'profileClean', from which any PGP keys have been extracted.
2. 'retrievedPGPs' data frame with columns 'date', 'vendor_hash' and 'PGPclean' (the extracted PGP keys).
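
How the package separates keys from profile text is not documented here. One plausible approach, shown purely as a sketch, is a regular expression over the standard ASCII-armored public key delimiters. The helper name sketchExtractPGP and the pattern are assumptions; only the column names 'profileClean' and 'PGPclean' come from this page.

```r
# Hypothetical sketch of PGP key extraction; not the package's own code.
# Assumption: keys appear as ASCII-armored public key blocks.
pgpPattern <- paste0(
  "(?s)-----BEGIN PGP PUBLIC KEY BLOCK-----",
  ".*?",
  "-----END PGP PUBLIC KEY BLOCK-----"
)

sketchExtractPGP <- function(profile) {
  # Locate the first armored block in each profile (-1 means no match)
  hit <- regexpr(pgpPattern, profile, perl = TRUE)
  found <- !is.na(hit) & hit != -1
  key <- rep(NA_character_, length(profile))
  key[found] <- regmatches(profile, hit)
  list(
    # Profile text with the first key block removed
    profileClean = trimws(sub(pgpPattern, "", profile, perl = TRUE)),
    # Extracted key, or NA where none was found
    PGPclean = key
  )
}

# Toy profiles, invented for illustration only
profile <- c(
  paste0("Vendor A\n-----BEGIN PGP PUBLIC KEY BLOCK-----\nabc\n",
         "-----END PGP PUBLIC KEY BLOCK-----"),
  "no key here"
)
res <- sketchExtractPGP(profile)
```

This sketch extracts only the first key block per profile; profiles containing several keys would need gregexpr() instead.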


xhtai/heisenbrgr documentation built on June 8, 2019, 9:30 a.m.