Quick cleanup of the characters in a string, typically assignee (company) names and inventor names.
If you run into errors with this function, you may need to convert the input to UTF-8 or ASCII first.
Calling iconv(thisVector, to = "UTF-8") or iconv(thisVector, to = "ASCII") should fix the problem. See the examples for the code.
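As a minimal sketch of that conversion on its own, the snippet below runs base R's iconv on a made-up vector (the names are hypothetical and not taken from the package data). The from = "UTF-8" and sub = "" arguments are assumptions added for illustration: sub = "" drops bytes that cannot be represented in ASCII instead of returning NA. The exact converted text can vary by platform.

```r
# Hypothetical raw names; the second contains non-ASCII characters (e-acute)
thisVector <- c("GOOGLE INC (US)", "Soci\u00e9t\u00e9 Anonyme (FR)")

# Convert to ASCII; sub = "" replaces unconvertible bytes with nothing
asciiVector <- iconv(thisVector, from = "UTF-8", to = "ASCII", sub = "")

# tolower() is now safe to call on the converted vector
lowered <- tolower(asciiVector)
print(lowered)
```

The converted vector can then be passed to cleanNames in place of the raw one.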
This function:
Removes values in parentheses, such as (US)
Changes all names to lower case
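The two steps listed above can be approximated in base R. This is only a sketch, not the package's implementation: sketchClean is a hypothetical helper, and it assumes the "(US)" example means that parenthesized text is stripped.

```r
# Hypothetical helper illustrating the two documented steps (not part of patentr)
sketchClean <- function(x) {
  # Drop parenthesized chunks such as "(US)", plus any whitespace before them
  x <- gsub("\\s*\\([^)]*\\)", "", x)
  # Lower-case everything and trim stray leading/trailing whitespace
  trimws(tolower(x))
}

demoNames <- sketchClean(c("GOOGLE INC (US)", "Ford Global Tech LLC (US)"))
print(demoNames)
# "google inc" "ford global tech llc"
```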
cleanNames(rawNames, firstAssigneeOnly = TRUE, sep = ";",
  removeStopWords = TRUE, stopWords = patentr::assigneeStopWords)
rawNames | The character vector you want to clean up.
firstAssigneeOnly | Logical, default TRUE. If TRUE, keeps only the first assignee when multiple exist.
sep | The character separating multiple assignees, default a semicolon (";").
removeStopWords | Logical, default TRUE. If TRUE, removes the common company stop words given in stopWords.
stopWords | An optional character vector of words you want to remove. Defaults to patentr::assigneeStopWords.
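To make the roles of firstAssigneeOnly, sep, and the stop-word arguments concrete, here is a rough base R sketch of the behavior they describe. It is an illustration under assumptions, not how cleanNames is actually written: the input names and the two-word stop list stopWordsDemo are made up, and the real default list is patentr::assigneeStopWords.

```r
# Hypothetical input: the first entry lists two assignees separated by ";"
rawNames <- c("Google Inc;Waymo LLC", "Ford Global Tech LLC")

# firstAssigneeOnly = TRUE, sep = ";": keep only the text before the first separator
firstOnly <- vapply(strsplit(rawNames, ";", fixed = TRUE),
                    function(parts) parts[1], character(1))

# removeStopWords = TRUE: strip whole-word matches of an assumed stop-word list
stopWordsDemo <- c("inc", "llc")
stopPattern <- paste0("\\b(", paste(stopWordsDemo, collapse = "|"), ")\\b")
noStops <- gsub(stopPattern, "", tolower(firstOnly), perl = TRUE)

# Collapse leftover whitespace
sketchResult <- trimws(gsub("\\s+", " ", noStops))
print(sketchResult)
# "google" "ford global tech"
```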
Returns a character vector of cleaned-up names.
assigneeNames <- cleanNames(acars$assignee)
# get a feel for the less-messy data
head(sort(table(assigneeNames), decreasing = TRUE))
# for a messier example, note you need to convert to ASCII/UTF-8 to get rid of errors
# associated with tolower
rawGoogleData <- system.file("extdata", "google_autonomous_search.csv", package = "patentr")
rawGoogleData <- read.csv(rawGoogleData, stringsAsFactors = FALSE, skip = patentr::skipGoogle)
rawGoogleData <- data.frame(lapply(rawGoogleData,
  function(x){iconv(x, to = "ASCII")}), stringsAsFactors = FALSE)
assigneeClean <- cleanNames(rawGoogleData$assignee)
head(sort(table(assigneeClean), decreasing = TRUE))