define_gender: Define gender based on first name.

View source: R/utils.R

define_gender    R Documentation

Define gender based on first name.

Description

Given a database table tbl, assigns the likely gender of each person based on their first name. The first name must be present as a column in tbl, and the name of that column is passed as the argument firstname_left.

Usage

define_gender(tbl, conn, firstname_left, drop_missing)

Arguments

tbl

A lazily evaluated dbplyr query on conn.

conn

An object of class SQLiteConnection to an SQLite database.

firstname_left

Name of the column in tbl that contains the first name and is used to join the gender on.

drop_missing

If TRUE, drops records without a clear gender assignment. An assignment is clear when the probability of either gender is 0.8 or higher.
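
The threshold rule for drop_missing can be illustrated with a minimal base R sketch. The data frame names_gender and its columns firstname and prob_female are hypothetical, as is the assumption that the two probabilities sum to 1; the actual layout of the internal table is not documented here.

```r
# Hypothetical lookup data; the column names are assumptions for illustration.
names_gender <- data.frame(
  firstname = c("anna", "peter", "robin"),
  prob_female = c(0.98, 0.01, 0.55),
  stringsAsFactors = FALSE
)

# Assumption: the two gender probabilities sum to 1.
prob_male <- 1 - names_gender$prob_female

# An assignment is "clear" when either gender has probability 0.8 or higher.
is_clear <- pmax(names_gender$prob_female, prob_male) >= 0.8

# Keep only the clear records, mirroring what drop_missing = TRUE describes.
clear_names <- names_gender[is_clear, ]
```

In this sketch, "robin" is dropped because neither probability reaches 0.8.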

Details

The function uses the internal table FirstNamesGender, which assigns the likely gender to each first name. The table is generated from genderize.io.

firstname_left should be free of middle names and middle initials; otherwise the gender assignment fails, even when the first name alone would yield a high-confidence assignment.
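
One way to prepare such a column is to keep only the first whitespace-delimited token of each name. The helper strip_middle below is hypothetical and not part of the package. It works on an in-memory character vector, so for a lazily evaluated tbl the cleaning would have to happen before the data is loaded into the database, or be expressed in a form that dbplyr can translate to SQL.

```r
# Hypothetical helper, not part of magutils: keep the first token of each name.
# This is a simplification; it also truncates double first names
# such as "Maria Luisa".
strip_middle <- function(x) {
  x <- trimws(x)            # remove leading and trailing whitespace
  sub("\\s+.*$", "", x)     # drop everything from the first inner whitespace on
}

cleaned <- strip_middle(c("John A.", "Maria Luisa", "  Wei "))
```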

Value

tbl augmented by a gender column.

Examples

## Not run: 
new_table <- define_gender(
  tbl = old_table, conn = conn,
  firstname_left = "firstname_old", drop_missing = TRUE
)

## End(Not run)


f-hafner/magutils documentation built on Sept. 20, 2023, 5:05 a.m.