addAgeGender: Function to enrich a filtered corpus with Twitter users' most...
In jeroenclaes/tweetCorp: tweetCorp: A package to work with Twitter corpora

This function cross-references the 'name' field in the corpus files with a large database of baby-name statistics drawn from two sources: the United States Social Security Administration (included in the R package 'babynames' by Hadley Wickham) and the Spanish Instituto Nacional de Estadística (INE). The function implements a cascade: it first attempts to find exact matches and then resorts to approximate string matching using Levenshtein distance.
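
The cascade described above can be illustrated with a minimal base-R sketch. This is not the package's implementation, which runs its approximate matching in C++. The table `name_db` and the helper `match_name` are hypothetical names invented for this example, and `utils::adist` stands in for the package's own distance code.

```r
# Hypothetical reference table; the real name databases are far larger.
name_db <- data.frame(
  name   = c("maria", "john", "jose", "emily"),
  gender = c("F", "M", "M", "F"),
  stringsAsFactors = FALSE
)

# Cascade lookup: try an exact match first, then fall back to the
# closest name within max_distance edits (Levenshtein distance).
match_name <- function(query, db, max_distance = 1) {
  query <- tolower(query)

  # Step 1: exact match
  exact <- match(query, db$name)
  if (!is.na(exact)) {
    return(db[exact, ])
  }

  # Step 2: approximate match; adist() returns a 1 x nrow(db) matrix here
  d <- utils::adist(query, db$name)[1, ]
  best <- which.min(d)
  if (d[best] <= max_distance) {
    return(db[best, ])
  }

  NULL  # nothing close enough
}

match_name("Maria", name_db)   # found by the exact step
match_name("Jon", name_db)     # one edit away from "john"
match_name("Zzzzz", name_db)   # no match within one edit
```

Running the cheap exact step first means the more expensive distance computation is only needed for names that fail it.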

addAgeGender(filtered_corpus, language = c("English", "Spanish"), maxDistance = 1, nthreads = parallel::detectCores())

`maxDistance`	maximum Levenshtein distance to use for approximate string matching. Defaults to 1
`nthreads`	number of threads to use in the C++ code for approximate string matching. Defaults to the number of CPU cores on your machine and it's probably a good idea to use that default.
`filtered_corpus`	filtered corpus. Do not use on unfiltered data if you want to get results in this century.
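
A hedged usage sketch for the arguments above: `parallel::detectCores()` can return `NA` when the core count cannot be determined, so a defensive caller may want a fallback before passing `nthreads`. The helper `safe_threads` is a hypothetical name introduced for this example. The call to `addAgeGender` is left commented out because it needs the tweetCorp package and an existing filtered corpus, neither of which is created here.

```r
# Hypothetical helper: fall back to one thread when the core count is unknown.
safe_threads <- function() {
  n <- parallel::detectCores()
  if (is.na(n) || n < 1) 1L else as.integer(n)
}

n_threads <- safe_threads()

# Requires tweetCorp and an already filtered corpus (assumed to exist):
# enriched <- addAgeGender(filtered_corpus, language = "English",
#                          maxDistance = 1, nthreads = n_threads)
```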