Summarize columns of a data frame.
Summarize a data frame df by names, a character vector of header names.
summarizeColumns(df, names, naOmit = FALSE)
df: A data frame of patent data.
names: A character vector of the header names that you want to summarize.
naOmit: Logical. If TRUE, remove NA values at the end of the summary. Useful when comparing fields that have NA values, such as features.
A data frame of summarized values.
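To make the shape of the returned data frame concrete, here is a minimal base R sketch of a grouped count. It is not the package's implementation: countByColumns and the toy data frame are hypothetical names introduced for illustration, and the count column is called total only because the ggplot call in the examples refers to a column of that name.

```r
# Hypothetical stand-in for illustration only; NOT the patentr source code.
# Counts rows per unique combination of the columns listed in `names`.
countByColumns <- function(df, names, naOmit = FALSE) {
  # Build one key string per row from the grouping columns.
  # Caveat of this sketch: a real NA and the literal string "NA" share a key.
  key <- do.call(paste, c(unname(as.list(df[names])), sep = "\r"))
  first <- !duplicated(key)

  # Keep one representative row per group, then count the members of each group.
  out <- df[first, names, drop = FALSE]
  out$total <- tabulate(match(key, key[first]), nbins = nrow(out))

  # Sort by the grouping columns; order() places NA groups last.
  out <- out[do.call(order, unname(as.list(out[names]))), , drop = FALSE]

  # Assumed meaning of naOmit: drop summary rows that contain NA.
  if (naOmit) {
    out <- out[stats::complete.cases(out), , drop = FALSE]
  }
  rownames(out) <- NULL
  out
}

# Made-up toy data, not patent data.
toy <- data.frame(
  assignee = c("A", "A", "B", NA, "B", "A"),
  score    = c(1, 2, 1, 3, 1, 1),
  stringsAsFactors = FALSE
)

byScore   <- countByColumns(toy, "score")                 # one row per score value
byBoth    <- countByColumns(toy, c("assignee", "score"))  # one row per pair
byBothNoNA <- countByColumns(toy, c("assignee", "score"), naOmit = TRUE)
```

In this sketch, the NA assignee forms its own group in byBoth and is dropped from byBothNoNA, which is the kind of difference naOmit is described as controlling.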
sumo <- cleanPatentData(patentData = patentr::acars,
                        columnsExpected = sumobrainColumns,
                        cleanNames = sumobrainNames,
                        dateFields = sumobrainDateFields,
                        dateOrders = sumobrainDateOrder,
                        deduplicate = TRUE,
                        cakcDict = patentr::cakcDict,
                        docLengthTypesDict = patentr::docLengthTypesDict,
                        keepType = "grant",
                        firstAssigneeOnly = TRUE,
                        assigneeSep = ";",
                        stopWords = patentr::assigneeStopWords)
# Note: in practice, a patent analyst needs to score these patents carefully;
# the random scores here are for demonstration purposes only.
score <- round(rnorm(nrow(sumo), mean = 1.4, sd = 0.9))
# clamp the scores to the range 0 to 3
score[score > 3] <- 3
score[score < 0] <- 0
sumo$score <- score
scoreSum <- summarizeColumns(sumo, "score")
scoreSum
# the plot below requires ggplot2; run library(ggplot2) first
# ggplot(scoreSum, aes(x = score, y = total, fill = factor(score))) + geom_bar(stat = "identity")
nameAndScore <- summarizeColumns(sumo, c("assigneeClean","score"))
# tail(nameAndScore)