email_to_sectors: Probabilistically match emails to an economic sector


View source: R/email_to_sectors.R

Description

This function assigns entries to a selected economic sector by probabilistically matching common domains and sub-domains such as ".edu" and ".co.uk". For matched entries, the output column contains "misc." together with the appropriate sector name. The function is also available as an optional parameter of detect_orgs(), alongside email_to_orgs().
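The idea of matching on domain suffixes can be sketched in base R. This is a simplified, deterministic illustration and not the tidyorgs implementation: the helper name classify_academic(), the suffix list, and the "Misc. Academic" label are all assumptions made for the example.

```r
# Hypothetical helper, not part of tidyorgs: flags emails (or bare domains)
# whose domain ends in an academic-looking suffix.
classify_academic <- function(emails, suffixes = c(".edu", ".ac.uk")) {
  # Drop everything up to the last "@"; bare domains pass through unchanged.
  domains <- tolower(sub("^.*@", "", emails))
  # TRUE where the domain ends with at least one of the suffixes.
  hits <- Reduce(`|`, lapply(suffixes, function(s) endsWith(domains, s)))
  # Assumed label for matches; unmatched entries stay missing.
  ifelse(hits, "Misc. Academic", NA_character_)
}

emails <- c("ada@cs.stanford.edu", "bob@example.com", "eve@ox.ac.uk", "MIT.EDU")
classify_academic(emails)
```

A suffix check like this treats every ".edu" address as academic, which is why a real classifier would more plausibly be described as probabilistic matching than as an exact lookup.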

Usage

email_to_sectors(
  data,
  id,
  input,
  output,
  sector = c("academic", "business", "government", "nonprofit")
)

Arguments

data

A data frame or data frame extension (e.g. a tibble).

id

A numeric or character vector unique to each entry.

input

Character vector of emails or email domains that will be matched to the economic sector.

output

Name of the output column to be created, given as a string or a symbol.

sector

Sector to match emails against. Currently, the only supported option is "academic"; "business", "government", "household", and "nonprofit" are in development.
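A character-vector default such as the one shown for sector in Usage is conventionally resolved in R with match.arg(), which falls back to the first choice when the argument is omitted. Whether email_to_sectors() does this internally is an assumption; the wrapper pick_sector() below is hypothetical and only illustrates the convention.

```r
# Hypothetical wrapper, not the tidyorgs source: shows how a vector of
# choices used as a default is typically resolved with match.arg().
pick_sector <- function(sector = c("academic", "business", "government", "nonprofit")) {
  match.arg(sector)
}

pick_sector()            # no argument: the first choice is used
pick_sector("business")  # an explicit choice is returned as given
# A value outside the listed choices, e.g. pick_sector("household"),
# raises an error in this sketch.
```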

Examples

library(tidyverse)
library(tidyorgs)
data(github_users)

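# Classify GitHub users by email; results are written to the "organization" column
classified_by_email <- github_users %>%
  email_to_sectors(id = login, input = email, output = organization, sector = "academic")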

brandonleekramer/tidyorgs documentation built on Dec. 19, 2021, 11:42 a.m.