make_req | R Documentation |
Easily generate a json request with correct schema to be passed to NIH RePORTER Project API
make_req(
criteria = list(fiscal_years = lubridate::year(Sys.Date())),
include_fields = NULL,
exclude_fields = NULL,
offset = 0,
limit = 500,
sort_field = NULL,
sort_order = NULL,
message = TRUE
)
criteria |
list(); the RePORTER Project API query criteria used to filter results (projects). See Details for schema and other spec rules. |
include_fields |
character(); optional; use to return only the specified fields from the result. See Details for valid return field names |
exclude_fields |
character(); optional; use to exclude specified fields from the result. |
offset |
integer(1); optional; default: 0; usually not explicitly passed by user. Used to set the start index of the results to be retrieved (indexed from 0). See Details. |
limit |
integer(1); optional; default: 500; restrict the number of project records returned per page/request inside the calling function. Defaulted to the maximum allowed value of 500. Reducing this may help with bandwidth/timeout issues. |
sort_field |
character(1); optional; use to sort the result by the specified field. May be useful in retrieving complete result sets above the API maximum of 10K (but below 2x the max = 20K) |
sort_order |
character(1): optional; one of "asc" or "desc"; |
message |
logical(1); default: TRUE; print a message with the JSON to console/stdout. You may want to suppress this at times. |
The maximum number of records that can be returned from any result set is 10,000. Also, the maximum record index
in the result set that can be returned is 9,999 - corresponding to the 10,000'th record in the set. These
constraints from the NIH API defy any intuition that the offset
argument would be useful to return records
beyond this 10K limit. If you need to do this, you have two options:
You can break your request into several smaller requests to be retrieved individually. For example,
requesting records for one fiscal year (see: fiscal_years
) at a time. This should be your first path
If you have a result set between 10,001 and 20,000 records, you might try passing essentially the same request
twice, but
varying them by the sort order on some field (and taking care to avoid or remove overlapping results).
See the sort_field
and sort_order
arguments.
criteria
must be specified as a list and may include any of the following (all optional) top level elements:
use_relevance
: logical(1); if TRUE (default), it will sort the most closely matching records per the search
criteria to the top (i.e. the NHI sorts descending according to a calculated match score)
fiscal_years
: numeric(); one or more fiscal years to retrieve projects that correspond to (or started in)
one of the fiscal years entered
include_active_projects
: logical(1); if TRUE (default), adds in active projects without regard for policy_years
pi_names
: list(); API will return records with Project Investigators (PIs) wildcard-matching any of the strings
requested.
If provided, the list must contain three named character vector elements: first_name
,
last_name
, any_name
. Each vector must contain at least one element - use a length-1
vector with an empty string (= ""
or = character(1)
) for any name field you do not wish to search on.
multi_pi_only: logical(1)
; default: FALSE; when multiple pi_names
are matched, setting this value to TRUE
changes the logic from returning project records associated with ANY matched PI name to those associated with ALL names.
po_names
: list(); Project Officers (POs), otherwise same comments as for pi_names
org_names
: character(); one or more strings to filter organization names. The provided string is implicitly
taken to include wildcards at head and tail ends; "JOHN" and "HOP" will both match "JOHNS HOPKINS UNIVERSITY", etc.
org_names_exact_match
: character(); one or more strings to exactly match organization names
pi_profile_ids
: numeric(); one or more project investigator profile IDs; results will match projects associated with
any of the IDs
org_cities
: character(); one or more cities in which associated organizations may be based.
org_states
: character(); one or more US States or Territories (note: requires the abbreviation codes: "NY", "PR", etc.)
in which a project organization may be based.
project_nums
: character(); one or more project numbers (note: the alphanumeric variety of numbers); results will match
any of the specified strings. You may include explicit wildcard operators ("*") in the strings, e.g. "5UG1HD078437-\*"
project_num_split
: list(6); the project_nums
can be broken down to meaningful components which can be searched
individually using this argument.
These component codes are defined here
Your list must contain all of the following named elements:
appl_type_code
: character();
activity_code
: character();
ic_code
: character();
serial_num
: character();
support_year
: character();
suffix_code
: character();
Provide a length-1 vector containing an empty string (=""
or =character(1)
) for any element you do not want to search on
spending_categories
: list(2); a list containing the following named elements:
values
: numeric(): the NIH spending category code. These are congressionally defined and are
available here
match_all
: logical(1); TRUE to return projects found in all categories; FALSE to return projects matching any one
of the categories.
funding_mechanism
: character(); one or more NIH funding mechanism codes used in the president's budget.
Available here
org_countries
: character(); one or more country names; e.g. "United States"
appl_ids
: numeric(); one or more application IDs (note: appl. IDs are natural numbers, unlike project_nums
)
agencies
: character(); one or more of the abbreviated NIH agency/institute/center names, available
here
is_agency_admin: logical(1)
; when specifying associated agencies
, set this value to TRUE
to further specify
that these agencies are administering the grant/project.
is_agency_funding: logical(1)
; when specifying associated agencies
, set this value to TRUE
to further specify
that these agencies are funding the grant/project.
activity_codes: character()
; a 3-character code identifying the grant, contract, or intramural activity through which a project is supported.
This is a more detailed description within each funding mechanism. Codes are available
here
award_types: character()
; (aka Type of Application) one or more grant/application type codes numbered 1-9.
See types here
dept_types: character()
; one or more of NIH standardized department type names (e.g. "PEDIATRICS"). Valid names are provided
here
cong_dists: character()
; one or more US congressional districts (e.g. "NY-20") which the project can be associated with.
See here
foa: character()
; one or more FOA (Funding Opportunity Announcements). Multiple projects may be tied to a single FOA.
See here
project_start_date
: list(2); provide a range for the project start date. Must pass as list containing the following named elements:
from_date
: character(1); string date in %Y-%m-%d format. See ?base::format
for converting from date class.
to_date
: character(1); string date in %Y-%m-%d format.
project_end_date: list(2)
; provide a range for the project end date - similar to project_start_date
.
from_date
: character(1); string date in %Y-%m-%d format. See ?base::format
for converting from date class.
to_date
: character(1); string date in %Y-%m-%d format.
organization_type: character()
; one or more types of applicant organizations (e.g. "SCHOOLS OF MEDICINE"). There does not appear to be a
documented list of valid values, but you can obtain one by pulling all records in a recent year and extracting unique values.
award_notice_date: list(2)
; the award notice date as a range, or you can provide just one of the min/max date, but if you do you must provide the other as an empty string.
from_date
: character(1); string date in %Y-%m-%d format. See ?base::format
for converting from date class.
to_date
: character(1); string date in %Y-%m-%d format.
award_amount_range: list(2)
; a numeric range - if you don't want to filter by this sub-criteria (but are filtering on some other award criteria),
enter 0 for min and 1e9 for max
min_amount: numeric(1)
; a real number between 0 and something very large
max_amount: numeric(1)
; a real number between 0 and something very large
exclude_subprojects: logical(1)
; default: FALSE; related to multiproject research awards, TRUE will limit results to just the parent project.
sub_project_only: logical(1)
; default: FALSE; similar to exclude_subprojects
, this field will limit results to just the subprojects,
excluding the parent.
newly_added_projects_only: logical(1)
; default: FALSE; return only those projects "newly added" (this is left undefined in the official
documentation) to the system.
covid_response: character();
one or more special selector codes used to return projects awarded to study COVID-19 and related topics as funded
and classified according to the below valid values/funding sources:
All
: all COVID-19 projects
Reg-CV
: those funded by regular NIH Appropriated funds
CV
: those funded by the Coronavirus Preparedness and Response Supplemental Appropriations Act, 2020
C3
: those funded by the CARES Act
C4
: those funded by the Paycheck Protection Program and Health Care Enhancement Act
C5
: those funded by the Coronavirus Response and Relief Supplemental Appropriations Act, 2021
C6
: those funded by the American Rescue Plan Act, 2021
full_study_sections: list(6)
; (not documented in API notes) Review activities of the Center for Scientific Review (CSR) are organized into
Integrated Review Groups (IRGs). Each IRG represents a cluster of study sections around a general scientific area. Applications generally are assigned
first to an IRG, and then to a specific study section within that IRG for evaluation of scientific merit.
This gets a bit complicated so we provide this resource for further reading.
If providing this criteria, you must include each of the below named elements as character vectors:
irg_code
: character(); Integrated Review Group
sra_designator_code
: character(); Scientific Review Administrator
sra_flex_code
: character();
group_code
: character();
name
: character();
url
: character();
advanced_text_search
: list(3); used to perform string search in the Project Title ("projecttitle"),
Abstract ("abstract"), and/or Project Terms ("terms") fields.
If providing this criteria, you must include each of the below named elements:
operator: character(1)
; one of "and", "or", "advanced". "and", "or" will be the logical operator between all provided search terms.
"advanced" allows the user to pass a boolean search string directly.
search_field: character()
; can be one or multiple of "abstract", "terms", "projecttitle" passed as a vector of length 1 to 3.
To search all fields, the user can alternatively pass a length 1 character vector containing the string "all" or "".
search_text: character(1)
; pass one or multiple search terms separated by spaces, without any quotations. If searching in
"advanced" mode, provide a boolean search string - you may use parentheses, AND, OR, NOT, and *escaped* double quotes
(e.g. search_text = "(brain AND damage) OR (\"insane in the membrane\") AND cure"
)
Full listing of available field names which can be specified in include_fields
, exclude_fields
, and sort_field
is located here
A standard json (jsonlite
flavor) object containing the valid JSON request string which can
be passed to get_nih_data
or elsewhere
library(repoRter.nih)
## all projects funded in the current (fiscal) year
req <- make_req()
## projects funded in 2019 through 2021
req <- make_req(criteria = list(fiscal_years = 2019:2021))
## projects funded in 2021 where the principal investigator first name is
## "Michael" or begins with "Jo"
req <- make_req(criteria =
list(fiscal_years = 2021,
pi_names =
list(first_name = c("Michael", "Jo*"),
last_name = c(""), # must specify
any_name = character(1) # same here
)
)
)
## all covid-related projects except those funded by American Rescue Plan
## and specify the fields to return, sorting ascending on ApplId column
req <- make_req(criteria =
list(covid_response = c("Reg-CV", "CV", "C3", "C4", "C5")
),
include_fields =
c("ApplId", "SubprojectId", "FiscalYear", "Organization",
"AwardAmount", "CongDist", "CovidResponse",
"ProjectDetailUrl"),
sort_field = "ApplId",
sort_order = "asc")
## using advanced_text_search with boolean search string
string <- "(head AND trauma) OR \"brain damage\" AND NOT \"psychological\""
req <- make_req(criteria =
list(advanced_text_search =
list(operator = "advanced",
search_field = c("terms", "abstract"),
search_text = string
)
)
)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.