RandomNames: Generate Random Names

View source: R/random.names.R

Description

The RandomNames function uses data from the Genealogy Data: Frequently Occurring Surnames from Census 1990–Names Files web page to generate a data.frame with random names.

Usage

RandomNames(N = 100, cat = NULL, gender = NULL, MFprob = NULL,
  dataset = NULL)

Arguments

N

The number of random names you want. Defaults to 100.

cat

Do you want "common" names, "rare" names, names with an "average" frequency, or some combination of these? Should be specified as a character vector (for example, c("rare", "common")). Defaults to NULL, in which case all names are used as the sample frame.

gender

Do you want first names from the "male" dataset, the "female" dataset, or from all available names? Should be specified as a quoted string (for example, "male"). Defaults to NULL, in which case all available first names are used as the sample frame.

MFprob

What proportion of the sample should be male names and what proportion should be female? Specify as a numeric vector that sums to 1 (for example, c(.6, .4)). The first number represents the probability of sampling a "male" first name, and the second number represents the probability of sampling a "female" name. This argument is not used if only one gender has been specified in the previous argument. Defaults to NULL, in which case, the probability used is c(.5, .5).

dataset

What do you want to use as the dataset of names from which to sample? A default dataset is provided that can generate over 400 million unique names. See the "Dataset Details" note for more information.
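The way the cat and MFprob arguments shape a sample can be illustrated with a small base R sketch. This is not the package's implementation: the toy data, the helper filter_cat, and the FirstName and LastName column names are assumptions made for illustration only.

```r
# Toy stand-in for a names dataset (hypothetical values, not the Census data)
toy <- list(
  surnames = data.frame(
    Name = c("Smith", "Jones", "Quill"),
    Category = c("common", "common", "rare"),
    stringsAsFactors = FALSE),
  malenames = data.frame(
    Name = c("James", "Osric"),
    Category = c("common", "rare"),
    stringsAsFactors = FALSE),
  femalenames = data.frame(
    Name = c("Mary", "Isolde"),
    Category = c("common", "rare"),
    stringsAsFactors = FALSE))

# Hypothetical helper: keep only rows in the requested categories.
# cat = NULL keeps every row, mirroring the documented default.
filter_cat <- function(df, cat = NULL) {
  if (is.null(cat)) df else df[df$Category %in% cat, , drop = FALSE]
}

set.seed(1)
N <- 10
MFprob <- c(.2, .8)  # P(male), P(female), in that order

# Draw a gender for each row, then a first name from the matching pool
gender <- sample(c("M", "F"), N, replace = TRUE, prob = MFprob)
male   <- filter_cat(toy$malenames, "common")$Name
female <- filter_cat(toy$femalenames, "common")$Name
first  <- ifelse(gender == "M",
                 sample(male, N, replace = TRUE),
                 sample(female, N, replace = TRUE))
last   <- sample(filter_cat(toy$surnames, "common")$Name, N, replace = TRUE)

sketch <- data.frame(Gender = gender, FirstName = first, LastName = last,
                     stringsAsFactors = FALSE)
table(sketch$Gender)
```

Because the gender of each row is drawn independently, the observed male/female split only approximates MFprob, especially for small N.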

Note

Dataset Details

This function samples from a provided dataset of names. By default, it uses the data from the Genealogy Data: Frequently Occurring Surnames from Census 1990–Names Files web page. Those data have been converted to a list named "CensusNames1990" containing three data.frames (named "surnames", "malenames", and "femalenames").

Alternatively, you may provide your own data in a list formatted like the "myCustomNames" data in the "Examples" section: a list of three data.frames named "surnames", "malenames", and "femalenames", each with a "Name" column and a "Category" column. Please remember that R is case sensitive!
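Before passing a custom list to RandomNames, it may help to check its shape. The sketch below is a hypothetical helper, check_names_dataset, that is not part of the package. It assumes the required structure is the one shown by "myCustomNames" in the Examples section: three data.frames with "Name" and "Category" columns.

```r
# Hypothetical helper (not part of the package): checks that a custom
# dataset has the same shape as "myCustomNames" in the Examples section.
check_names_dataset <- function(dataset) {
  needed_items <- c("surnames", "malenames", "femalenames")
  needed_cols  <- c("Name", "Category")
  missing_items <- setdiff(needed_items, names(dataset))
  if (length(missing_items) > 0) {
    stop("Missing list item(s): ", paste(missing_items, collapse = ", "))
  }
  for (item in needed_items) {
    df <- dataset[[item]]
    if (!is.data.frame(df)) stop(item, " must be a data.frame")
    missing_cols <- setdiff(needed_cols, names(df))
    if (length(missing_cols) > 0) {
      stop(item, " is missing column(s): ",
           paste(missing_cols, collapse = ", "))
    }
  }
  invisible(TRUE)
}

good <- list(
  surnames    = data.frame(Name = LETTERS[1:3], Category = "common"),
  malenames   = data.frame(Name = letters[1:3], Category = "rare"),
  femalenames = data.frame(Name = letters[4:6], Category = "average"))
check_names_dataset(good)

# Names are case sensitive: "Surnames" is not "surnames"
bad <- good
names(bad)[1] <- "Surnames"
result <- tryCatch(check_names_dataset(bad),
                   error = function(e) conditionMessage(e))
```

The check uses exact name matching, so a capitalized item or column name fails the same way it would when R looks it up.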

Author(s)

Ananda Mahto

Examples

# Generate 20 random names
RandomNames(N = 20)

# Generate a reproducible list of 100 random names with approximately
#   80% of the names being female names, and 20% being male names.
set.seed(1)
temp <- RandomNames(cat = "common", MFprob = c(.2, .8))
list(head(temp), tail(temp))
table(temp$Gender)

# Cleanup
rm(.Random.seed, envir=globalenv()) # Resets your seed
rm(temp)

# Generate 10 names from the common and rare categories of names
RandomNames(N = 10, cat = c("common", "rare"))

## ===================================== ##
## ======== USING YOUR OWN DATA ======== ##

myCustomNames <- list(
  surnames = data.frame(
    Name = LETTERS[1:26],
    Category = c(rep("rare", 10), rep("average", 10), rep("common", 6))),
  malenames = data.frame(
    Name = letters[1:10],
    Category = c(rep("rare", 4), rep("average", 4), rep("common", 2))),
  femalenames = data.frame(
    Name = letters[11:26],
    Category = c(rep("rare", 8), rep("average", 4), rep("common", 4))))
str(myCustomNames)

RandomNames(N = 15, dataset = myCustomNames)

mrdwab/mrdwabmisc documentation built on May 23, 2019, 7:15 a.m.