View source: R/detect_business.R
This function standardizes messy text data and/or email information to social organizations. The detect_orgs() function iterates through email domains and unstructured text, matching patterns in our curated dictionaries to standardize organization names. This tool is designed to optimize pattern detection for the linkage of multiple datasets, for bibliometric analysis, and for sector classification in social, economic, and policy analysis.
data | A data frame or data frame extension (e.g. a tibble).
id | A numeric or character vector unique to each entry.
input | Character vector of messy or unstructured text that will be matched to organizations from one (or all) of five economic sectors (see sector parameter).
output | Output column to be created as string or symbol.
email | Optional character vector of email or email domain information. Defaults to FALSE.
country | Optional parameter that returns country of organization when available. Defaults to FALSE.
parent_org | Optional parameter that returns the parent organization when available. For the academic sector, this value is the school system of the organization. Defaults to FALSE.
org_type | Optional parameter that returns organization type when available. Current return values include "Public", "Private for-profit", and "Private not-for-profit". Defaults to FALSE.
library(tidyverse)
library(tidyorgs)
data(github_users)
classified_users <- github_users %>%
  detect_business(login, company, organization, email, parent_org, org_type)
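The description above says the function matches email domains and unstructured text against curated dictionaries. The base R sketch below illustrates that general idea only; it is not the tidyorgs implementation. The `org_dictionary` table, its entries, the `match_org()` helper, and the choice to check the email domain before the free text are all assumptions made for illustration.

```r
# Hypothetical mini-dictionary: one regex and one email domain per
# standardized name. These entries are illustrative and are NOT taken
# from the tidyorgs dictionaries.
org_dictionary <- data.frame(
  organization = c("Microsoft", "Google"),
  text_pattern = c("\\bmicrosoft\\b", "\\bgoogle\\b"),
  email_domain = c("microsoft.com", "google.com"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first dictionary match for one record,
# or NA if nothing matches.
match_org <- function(text, email = NA_character_, dict = org_dictionary) {
  # 1. Try the email domain first (everything after the last "@").
  #    Checking email before text is an assumption of this sketch.
  if (!is.na(email)) {
    domain <- tolower(sub(".*@", "", email))
    hit <- dict$organization[dict$email_domain == domain]
    if (length(hit) > 0) return(hit[1])
  }
  # 2. Fall back to case-insensitive regex matching on the free text.
  if (!is.na(text)) {
    hits <- vapply(
      dict$text_pattern,
      function(p) grepl(p, text, ignore.case = TRUE, perl = TRUE),
      logical(1)
    )
    if (any(hits)) return(dict$organization[which(hits)[1]])
  }
  NA_character_
}

# Toy input with messy company text and partial email information.
users <- data.frame(
  login   = c("a", "b", "c"),
  company = c("@Microsoft Research", "freelance", NA),
  email   = c(NA, "jane@google.com", NA),
  stringsAsFactors = FALSE
)

# Apply the matcher row by row and store the standardized name.
users$organization <- mapply(match_org, users$company, users$email,
                             USE.NAMES = FALSE)
```

In this toy version, entries that match neither the email domains nor the text patterns are left as NA, so unmatched records remain easy to filter or inspect.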