extractDocLength: Get a code for length of doc and country code


Description

Generate a custom concatenation of country code and length of the publication number, for document type identification purposes.

Because free patent sites provide limited metadata, a downloaded data set often does not include the type of patent document. There are two easy ways to infer the type, listed below. The output of this function can be matched against a dictionary stored with the package to look up the type of patent document.

  1. The kind code, if present, is typically used consistently within each country. B usually denotes a granted patent and A usually denotes an application.

  2. The length of the publication number, together with the country code, is another good indicator. In the USA, applications have 11 digits and granted patents, for now, have 9.
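
The second approach can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's own implementation: `doc_length_code` and `type_lookup` are hypothetical names, the publication numbers are made up, and the lookup entries simply restate the US digit counts quoted above.

```r
# Hypothetical stand-in for the idea described above; not the package's code.
doc_length_code <- function(countryCode, pubNum) {
  # nchar() counts the characters in each publication number;
  # paste0() joins that count to the country code, element by element.
  paste0(countryCode, nchar(pubNum))
}

# Hypothetical lookup table (assumption): maps a code to a document type.
type_lookup <- c(US11 = "application", US9 = "grant")

# Made-up publication numbers with 11 and 9 digits.
codes <- doc_length_code(c("US", "US"), c("20150123456", "123456789"))
types <- unname(type_lookup[codes])
```

Indexing a named character vector by the generated codes is one simple way to do the dictionary match; a code with no entry in the table would come back as `NA`.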

Usage

extractDocLength(countryCode, pubNum)

Arguments

countryCode

A string vector of country codes

pubNum

A string vector of the numeric portion of a publication number.
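
For readers who want to see what the two arguments might look like, here is a rough base R sketch of splitting a document number into a country code and a numeric portion. It assumes a "two-letter country code, digits, optional kind code" layout and uses made-up document numbers; it is not how the package's `extractCountryCode` or `extractPubNumber` are implemented.

```r
# Made-up document numbers in an assumed "CC + digits + kind code" layout.
docNum <- c("US20150123456A1", "EP123456789B1")

# Assumption: the first two characters are the country code.
countryCode <- substr(docNum, 1, 2)

# Drop the two-letter prefix, then strip a trailing kind code
# (one capital letter, optionally followed by one digit).
pubNum <- sub("[A-Z][0-9]?$", "", substr(docNum, 3, nchar(docNum)))
```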

Value

A string vector of concatenated country code and publication number length, such as US11 or EP9.

Examples

acars$pubNum <- extractPubNumber(acars$docNum)
acars$countryCode <- extractCountryCode(acars$docNum)
acars$officeDocLength <- extractDocLength(countryCode = acars$countryCode,
                                          pubNum = acars$pubNum)
head(acars[,c("officeDocLength","docNum")])

kamilien1/patentR documentation built on May 20, 2019, 7:19 a.m.