Authors: Brandon Kramer with contributions from members of the University of Virginia's Biocomplexity Institute, the National Center for Science and Engineering Statistics, and the 2020 and 2021 UVA Data Science for the Public Good Open Source Software Teams
License: MIT
You can install this package using the devtools package:
install.packages("devtools")
devtools::install_github("brandonleekramer/tidyorgs")
The tidyorgs package provides several functions that help standardize messy text data for organizational analysis. More specifically, the package's two core sets of functions, detect_{sector}() and email_to_orgs(), standardize organizations from across the academic, business, government and nonprofit sectors based on unstructured text and email domains. The package is intended to support linkage across multiple datasets, bibliometric analysis, and sector classification for social, economic, and policy analysis.
detect_orgs() function
The detect_{sector}() functions detect patterns in messy text data and then standardize them into organizations based on a curated dictionary. For example, messy bio information scraped from GitHub can be codified so that statistical analysis can be run on academic users.
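To make the idea of dictionary-based standardization concrete, here is a minimal base R sketch. It is not the implementation used by tidyorgs: the `org_dictionary` table, its regular expressions, and the `standardize_org()` helper are all hypothetical names invented for illustration, and the real curated dictionary is far larger.

```r
# Hypothetical mini-dictionary: one regular expression per standardized name.
# These patterns are illustrative only and are NOT taken from tidyorgs.
org_dictionary <- data.frame(
  organization = c("University of Virginia",
                   "Massachusetts Institute of Technology"),
  pattern = c("university of virginia|\\buva\\b",
              "massachusetts institute of technology|\\bmit\\b"),
  stringsAsFactors = FALSE
)

# Return the first standardized name whose pattern matches each input,
# or NA when nothing in the dictionary matches.
standardize_org <- function(text, dictionary = org_dictionary) {
  cleaned <- tolower(trimws(text))
  result <- rep(NA_character_, length(cleaned))
  for (i in seq_len(nrow(dictionary))) {
    # Only fill entries that have not been matched by an earlier row
    hit <- is.na(result) & grepl(dictionary$pattern[i], cleaned, perl = TRUE)
    result[hit] <- dictionary$organization[i]
  }
  result
}

messy <- c("University of Virginia, Charlottesville", "@UVA Biocomplexity",
           "MIT CSAIL", "Freelance developer")
standardized <- standardize_org(messy)

# A 0/1 indicator, analogous in spirit to the `academic` column filtered on
# in the examples in this README
academic <- as.integer(!is.na(standardized))
```

In this sketch, unmatched text stays `NA` rather than being guessed at, which keeps false positives out of any downstream sector counts.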
detect_academic()
library(tidyverse)
library(tidyorgs)
data(github_users)

classified_academic <- github_users %>%
  detect_academic(login, company, organization, email) %>%
  filter(academic == 1) %>%
  select(login, organization, company)

classified_academic
detect_business()
classified_businesses <- github_users %>%
  detect_business(login, company, organization, email) %>%
  filter(business == 1) %>%
  select(login, organization, company)

classified_businesses
detect_government()
classified_government <- github_users %>%
  detect_government(login, company, organization, email) %>%
  filter(government == 1) %>%
  select(login, organization, company)

classified_government
detect_nonprofit()
classified_nonprofit <- github_users %>%
  detect_nonprofit(login, company, organization, email) %>%
  filter(nonprofit == 1) %>%
  select(login, organization, company, email)

classified_nonprofit
email_to_orgs()
For those who only have email information, the email_to_orgs() function matches users to organizations based on our curated domain list.
user_emails_to_orgs <- github_users %>%
  email_to_orgs(login, email, country_name, "academic")

github_users %>%
  left_join(user_emails_to_orgs, by = "login") %>%
  drop_na(country_name) %>%
  select(email, country_name)
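For intuition about matching on email domains, the following base R sketch shows one plausible way such a lookup could work. It is an assumption-laden illustration, not the package's actual method: the `domain_lookup` vector and the `match_email_domain()` helper are hypothetical, and the subdomain fallback is a design choice made here rather than documented tidyorgs behavior.

```r
# Hypothetical domain lookup; NOT the curated list shipped with tidyorgs.
domain_lookup <- c(
  "virginia.edu" = "University of Virginia",
  "mit.edu"      = "Massachusetts Institute of Technology"
)

match_email_domain <- function(email, lookup = domain_lookup) {
  # Keep everything after the last "@" and lowercase it
  domain <- tolower(sub("^.*@", "", email))
  result <- rep(NA_character_, length(domain))
  for (i in seq_along(domain)) {
    d <- domain[i]
    # Try the full domain first, then strip leading subdomains one at a time
    while (!is.na(d) && nzchar(d)) {
      if (d %in% names(lookup)) {
        result[i] <- lookup[[d]]
        break
      }
      if (!grepl(".", d, fixed = TRUE)) break
      d <- sub("^[^.]*\\.", "", d)
    }
  }
  result
}

emails <- c("jane@cs.virginia.edu", "bob@MIT.EDU", "sam@gmail.com", NA)
matched <- match_email_domain(emails)
```

Stripping subdomains lets an address such as `jane@cs.virginia.edu` resolve to the parent institution, while generic providers that are absent from the lookup simply return `NA`.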