knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
So far, the R package names.cze can do only one thing: estimate the gender of people living in the Czech Republic from their names. This is a first preview; a more robust solution and more features will come later.
You can install the preview version from GitHub:
devtools::install_github("MarekProkop/names.cze")
Guess the gender of a single name:
library(names.cze)
get_sex("Josef")
get_sex("Nováková")
The get_sex function is vectorized:
get_sex(c("Jan", "Jana", "Petr", "Petra"))
Using with dplyr:
library(dplyr, quietly = TRUE)
df <- tibble(
  name = c("Jan", "Jitka", "Soňa", "Michal", "Radovan")
)
df %>%
  mutate(sex = get_sex(name))
Using with ggplot2:
library(ggplot2, quietly = TRUE)
df %>%
  ggplot(aes(get_sex(name))) +
  geom_bar() +
  labs(x = "gender")
Without additional parameters, the get_sex function returns the gender that is more common among bearers of the name in the Czech Republic. If you want more certainty, use the threshold parameter, a number from 0 to 1 that sets the minimum required probability. If the actual probability is lower than this value, the function returns NA.
get_sex("Rut")
get_sex("Rut", threshold = 0.9)
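The threshold rule described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper pick_sex, the "F"/"M" labels, and the probabilities are all hypothetical.

```r
# Hypothetical helper, not part of names.cze: given the probability that
# a name belongs to a woman, return the more likely gender, or NA when
# that gender's probability is below the threshold.
pick_sex <- function(p_female, threshold = 0.5) {
  p_max <- pmax(p_female, 1 - p_female)     # probability of the likelier gender
  sex <- ifelse(p_female >= 0.5, "F", "M")  # made-up labels for illustration
  ifelse(p_max >= threshold, sex, NA_character_)
}

# Made-up probabilities: a clear case, an ambiguous one, another clear case
result <- pick_sex(c(0.99, 0.6, 0.02), threshold = 0.9)
print(result)
```

With these made-up inputs, only the ambiguous middle value falls below the threshold and becomes NA.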
If you have both first and last names stored in one variable, the get_sex function will not work, because for now it handles only one-word names. In this case, pass only the first name to the function, as it generally has better predictive value. For example, you can use the word function from the stringr package, which returns the first word by default.
get_sex("Josef Novák")
library(stringr, quietly = TRUE)
get_sex(word("Josef Novák"))
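If you would rather not depend on stringr, the first word can probably be extracted with base R as well. The sketch below assumes that names are separated by whitespace and that the first name comes first; the helper first_word is hypothetical and not part of names.cze.

```r
# Hypothetical helper: keep everything before the first whitespace,
# after trimming leading and trailing spaces.
first_word <- function(x) sub("\\s.*$", "", trimws(x))

full_names <- c("Josef Novák", "  Jana Nováková", "Petr")
first_names <- first_word(full_names)
print(first_names)

# With names.cze installed, the result could then be passed on:
# get_sex(first_names)
```

Because sub() is vectorized, the helper works on a whole column of full names at once.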
For debugging purposes, you can use the inspect_name function. It is not vectorized, so you can only pass one name to it.
inspect_name("Václav")
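Since inspect_name accepts only one name at a time, one way to inspect several names is to loop over them with lapply. This is a sketch that assumes nothing about what inspect_name returns; the guard simply skips the calls when the package is not installed.

```r
names_to_check <- c("Václav", "Rut", "Nikola")

if (requireNamespace("names.cze", quietly = TRUE)) {
  # Call inspect_name once per name and collect the results in a list
  results <- lapply(names_to_check, names.cze::inspect_name)
  names(results) <- names_to_check
} else {
  message("names.cze is not installed; skipping inspect_name calls")
}
```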