search_pv: Search PatentsView

Description

This function makes an HTTP request to the PatentsView API for data matching the user's query.

Usage

search_pv(query, fields = NULL, endpoint = "patents", subent_cnts = FALSE,
  mtchd_subent_only = TRUE, page = 1, per_page = 25, all_pages = FALSE,
  sort = NULL, method = "GET", error_browser = getOption("pv_browser"),
  ...)

Arguments

query

The query that the API will use to filter records. query can come in any one of the following forms:

  • A character string with valid JSON.
    E.g., '{"_gte":{"patent_date":"2007-01-04"}}'

  • A list which will be converted to JSON by search_pv.
    E.g., list("_gte" = list("patent_date" = "2007-01-04"))

  • An object of class pv_query, which you create by calling one of the functions found in the qry_funs list. See the writing queries vignette for details.
    E.g., qry_funs$gte(patent_date = "2007-01-04")
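
The JSON-string and list forms describe the same filter. The sketch below assumes the list is serialized roughly the way jsonlite::toJSON does with auto_unbox = TRUE. That is an assumption made for illustration, not a statement of how search_pv performs the conversion internally.

```r
library(jsonlite)

# List form of the query, as in the second bullet above.
query_list <- list("_gte" = list("patent_date" = "2007-01-04"))

# auto_unbox = TRUE writes length-one vectors as JSON scalars, not arrays.
# (Assumed serialization settings; search_pv's internals may differ.)
query_json <- as.character(toJSON(query_list, auto_unbox = TRUE))

print(query_json)
```

Under that assumption, the printed string matches the JSON shown in the first bullet.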

fields

A character vector of the fields that you want returned to you. A value of NULL indicates that the default fields should be returned. Acceptable fields for a given endpoint can be found at the API's online documentation (e.g., check out the field list for the patents endpoint) or by viewing the fieldsdf data frame (View(fieldsdf)). You can also use get_fields to list out the fields available for a given endpoint.

endpoint

The web service resource you wish to search. endpoint must be one of the following: "patents", "inventors", "assignees", "locations", "cpc_subsections", "uspc_mainclasses", or "nber_subcategories".

subent_cnts

Do you want the total counts of unique subentities to be returned? This is equivalent to the include_subentity_total_counts parameter described in the API's online documentation.

mtchd_subent_only

Do you want only the subentities that match your query to be returned? A value of TRUE indicates that a subentity has to meet your query's requirements in order to be returned, while a value of FALSE indicates that all subentity data will be returned, even records that don't meet your query's requirements. This is equivalent to the matched_subentities_only parameter described in the API's online documentation.

page

The page number of the results that should be returned.

per_page

The number of records that should be returned per page. This value can be as high as 10,000 (e.g., per_page = 10000).

all_pages

Do you want to download all possible pages of output? If all_pages = TRUE, the values of page and per_page are ignored.
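
To get a feel for how many requests a full download involves, the arithmetic can be sketched as follows. The total record count is made up, and the page size of 10,000 is the documented maximum rather than a value search_pv is confirmed to use when all_pages = TRUE.

```r
# Hypothetical number of records matching a query (made up for illustration).
total_hits <- 23456L

# Assumed page size: the documented maximum for per_page.
per_page <- 10000L

# Number of pages needed to cover every record.
n_pages <- ceiling(total_hits / per_page)

# Records on each page; the last page holds the remainder.
pages <- seq_len(n_pages)
page_sizes <- pmin(per_page, total_hits - (pages - 1) * per_page)

print(n_pages)
print(page_sizes)
```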

sort

A named character vector where the name indicates the field to sort by and the value indicates the direction of sorting (direction should be either "asc" or "desc"). For example, sort = c("patent_number" = "asc") or sort = c("patent_number" = "asc", "patent_date" = "desc"). sort = NULL (the default) means do not sort the results. You must include any fields that you wish to sort by in fields.
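
As a rough illustration of how a named sort vector maps onto a request parameter, the sketch below turns it into a JSON array of single-key objects, one per sort field. Both the target JSON shape and the variable names (sort_arg, sort_list, sort_json) are assumptions for illustration and are not taken from the package source.

```r
library(jsonlite)

sort_arg <- c("patent_number" = "asc", "patent_date" = "desc")

# Fail early on a direction other than "asc" or "desc".
stopifnot(all(sort_arg %in% c("asc", "desc")))

# One single-key object per sort field, kept in priority order.
sort_list <- lapply(names(sort_arg), function(field) {
  setNames(list(unname(sort_arg[field])), field)
})

# Assumed wire format: a JSON array of {"field":"direction"} objects.
sort_json <- as.character(toJSON(sort_list, auto_unbox = TRUE))

print(sort_json)
```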

method

The HTTP method that you want to use to send the request. Possible values include "GET" or "POST". Use the POST method when your query is very long (say, over 2,000 characters in length).
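
The rule of thumb above can be sketched as a small helper. choose_method is a hypothetical function, not part of the package, and the 2,000-character threshold is the guideline from the text rather than a hard limit; the patent numbers in the long query are made up.

```r
# Hypothetical helper: pick the HTTP method from the query length.
choose_method <- function(query_json, max_get_chars = 2000) {
  if (nchar(query_json) > max_get_chars) "POST" else "GET"
}

# A short query fits comfortably in a GET request.
short_query <- '{"_gt":{"patent_year":2010}}'

# A long query: an "_or" over 200 made-up patent numbers.
clauses <- sprintf('{"patent_number":"%d"}', 5000000:5000199)
long_query <- paste0('{"_or":[', paste(clauses, collapse = ","), ']}')

print(choose_method(short_query))
print(choose_method(long_query))
```

The result could then be passed as the method argument of search_pv.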

error_browser

The program used to view the HTML error messages sent by the API. This should be one of the following:

  • The string "false" (the default), which turns error browsing off. Instead of viewing the HTML in this case, you will get a portion of the parsed HTML text returned to you as an R error.

  • The name of the program to use to view the HTML error message. If the name of the program is on your PATH, just specify the program name (e.g., error_browser = "chrome"). Otherwise, include the full path to the program.

  • An R function to be called to invoke the browser (e.g.,
    error_browser = rstudioapi::viewer)

  • Under Windows, a value of NULL is allowed and implies that the file association mechanism will be used to determine which browser is used.

...

Arguments passed along to httr's GET or POST function.

Value

A list with the following three elements:

data

A list with one element: a named data frame containing the data returned by the server. Each row in the data frame corresponds to a single value for the primary entity. For example, if you search the assignees endpoint, then the data frame will be on the assignee level, where each row corresponds to a single assignee. Fields that are not on the assignee level are returned in list columns.
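
For readers unfamiliar with list columns, here is a minimal base R sketch of the shape described above: one row per assignee, with patent-level fields nested in a list column. All values are made up for illustration and the exact column layout returned by the server may differ.

```r
# One row per assignee (the primary entity); values are invented.
assignees <- data.frame(
  assignee_id = c("a1", "a2"),
  assignee_organization = c("Example Corp", "Sample LLC"),
  stringsAsFactors = FALSE
)

# Patent-level data nested in a list column: one data frame per assignee.
assignees$patents <- list(
  data.frame(patent_number = c("1000001", "1000002"), stringsAsFactors = FALSE),
  data.frame(patent_number = "1000003", stringsAsFactors = FALSE)
)

# Count the nested patent rows for each assignee.
patents_per_assignee <- vapply(assignees$patents, nrow, integer(1))

print(patents_per_assignee)
```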

query_results

Entity counts across all pages of output (not just the page returned to you). If you set subent_cnts = TRUE, you will be returned both the counts of the primary entities and the subentities.

request

Details of the HTTP request that was sent to the server. When you set all_pages = TRUE, you will only get a sample request. In other words, you will not be given multiple requests for the multiple calls that were made to the server (one for each page of results).

Examples

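# Query given as a JSON string:
search_pv(query = '{"_gt":{"patent_year":2010}}')

# Same query built with qry_funs, requesting the fields in the "patents"
# and "assignees" groups of the patents endpoint:
search_pv(
  query = qry_funs$gt(patent_year = 2010),
  fields = get_fields("patents", c("patents", "assignees"))
)

# Send the request with POST and sort the results by patent number:
search_pv(
  query = qry_funs$gt(patent_year = 2010),
  method = "POST",
  fields = "patent_number",
  sort = c("patent_number" = "asc")
)

# Download every page of results:
search_pv(
  query = qry_funs$eq(inventor_last_name = "crew"),
  all_pages = TRUE
)

# Search the assignees endpoint instead of the default patents endpoint:
search_pv(
  query = qry_funs$contains(inventor_last_name = "smith"),
  endpoint = "assignees"
)

# Pass extra arguments through to httr (here, a 40-second timeout):
search_pv(
  query = qry_funs$contains(inventor_last_name = "smith"),
  config = httr::timeout(40)
)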

## Not run: 

# Will view error message in RStudio's viewer pane:
search_pv(
   query = with_qfuns(not(text_any(patent_title = "hi"))),
   error_browser = rstudioapi::viewer
)

## End(Not run)

crew102/patentsview documentation built on May 14, 2019, 11:33 a.m.