generateDocType: Determine the patent document type


View source: R/cleanPatentData.R

Description

Determine the type of document from the patent publication data.

Data exports from publicly available sources often do not provide the type of patent document or, when they do, the values still require standardization. Using the kind code, the country code, and pre-developed dictionaries for document length and country code, you can get a close approximation of the document types.

Note that you can use View(lens[lens$docType=="NA",]) to view the documents whose type was not found. These often come from small countries. You can add entries to cakcDict to cover them, or ignore them if you only want to focus on the larger countries, which are all covered.
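
The dictionary lookup described above can be pictured with a minimal base R sketch. The toy dictionary toyDict, its entries, and the helper lookupDocType are hypothetical and exist only for illustration; the real cakcDict and docLengthTypesDict may be structured differently, and generateDocType may combine them in another way.

```r
# Hypothetical toy dictionary: names are country code + kind code,
# values are document types. These are NOT the contents of cakcDict.
toyDict <- c(USB2 = "grant", USA1 = "application", EPB1 = "grant")

# Hypothetical helper: look up each code by name in the dictionary.
# match() returns NA for codes that are absent, so those entries
# come back as NA; unname() drops the lookup keys from the result.
lookupDocType <- function(countryAndKindCode, dict) {
  unname(dict[match(countryAndKindCode, names(dict))])
}

codes <- c("USB2", "EPB1", "XXZ9")   # "XXZ9" is deliberately unknown
docType <- lookupDocType(codes, toyDict)
print(docType)
```

Under this assumed layout, covering a missing office would amount to adding one more named entry to the dictionary before running the lookup again.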

Usage


Arguments

officeDocLength

The concatenation of the country code and the number of numerical digits in the publication number. Extracted using the extractDocLength function.

countryAndKindCode

The concatenation of the country code and the kind code. Extracted using the extractCountryCode and extractKindCode functions.

cakcDict

A country and kind code dictionary. Default is cakcDict.

docLengthTypesDict

A document length and type dictionary. Default is docLengthTypesDict.

Value

A character vector labeling the document type, with NA where no match was found.
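
The Description filters on the string "NA", while the value is described here as NA. A defensive check can cover both cases. The following is a small sketch on a made-up vector, not on actual output from generateDocType:

```r
# Made-up result vector for illustration only.
docType <- c("grant", NA, "application", "NA", "grant")

# TRUE where the type is missing, either as a real NA
# or as the literal string "NA".
notFound <- is.na(docType) | docType == "NA"

# Number of documents without a usable type.
sum(notFound)

# Counts per type; useNA = "ifany" also reports real NA values.
table(docType, useNA = "ifany")
```

The same logical vector can be used to subset a data frame, for example to inspect which country and kind codes are not yet covered.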

See Also

cakcDict, docLengthTypesDict

Examples

# work on a local copy of the acars data set
acars <- acars

# publication number, e.g. ####
acars$pubNum <- extractPubNumber(acars$docNum)
# country code, e.g. USAPP, USD
acars$countryCode <- extractCountryCode(acars$docNum)
# country code concatenated with the publication number length
acars$officeDocLength <- extractDocLength(countryCode = acars$countryCode,
                                          pubNum = acars$pubNum)
acars$kindCode <- extractKindCode(acars$docNum)
# country code concatenated with the kind code
acars$countryAndKindCode <- with(acars, paste0(countryCode, kindCode))

# label each document using both dictionaries
acars$docType <- generateDocType(officeDocLength = acars$officeDocLength,
                                 countryAndKindCode = acars$countryAndKindCode,
                                 cakcDict = cakcDict,
                                 docLengthTypesDict = docLengthTypesDict)
# count the documents of each type
table(acars$docType)

kamilien1/patentr documentation built on May 20, 2019, 7:19 a.m.