automatic_NLP_processor: Process NLP Annotations on the Current Patient Cohort

View source: R/text_processing.R

automatic_NLP_processor    R Documentation

Process NLP Annotations on the Current Patient Cohort

Description

Performs NLP annotation on the patients listed in a vector of patient IDs or, if no vector is supplied, on all available patients in the database.

Usage

automatic_NLP_processor(
  patient_vect = NA,
  text_format = "latin1",
  nlp_engine = "udpipe",
  uri_fun = mongo_uri_standard,
  user,
  password,
  host,
  replica_set,
  port,
  database,
  max_n_grams_length = 7,
  negex_depth = 6,
  select_cores = NA,
  URL = NA
)
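
The connection arguments in the signature above (user, password, host, replica_set, port, database) are turned into a MongoDB URI by the function passed as uri_fun. As a rough illustration only, the sketch below shows what such a URI-generating function might look like. The name example_uri_fun and its argument list are hypothetical and are not taken from the package; the only thing assumed is the standard MongoDB connection string layout.

```r
# Hypothetical URI builder for illustration; NOT the package's mongo_uri_standard.
# Assumes the standard MongoDB connection string layout:
#   mongodb://user:password@host[:port]/database[?replicaSet=name]
example_uri_fun <- function(user, password, host, replica_set = NA,
                            port = NA, database) {
  # Percent-encode credentials so reserved characters such as "@" or ":" are safe
  user_enc <- utils::URLencode(user, reserved = TRUE)
  pass_enc <- utils::URLencode(password, reserved = TRUE)

  # Append the port only when one is given
  host_part <- if (is.na(port)) host else paste0(host, ":", port)
  uri <- paste0("mongodb://", user_enc, ":", pass_enc, "@", host_part, "/", database)

  # Append the replica set name only when one is given
  if (!is.na(replica_set)) {
    uri <- paste0(uri, "?replicaSet=", replica_set)
  }
  uri
}

example_uri <- example_uri_fun("John", "db_password_1234", "server1234",
                               port = 27017, database = "TEST_PROJECT")
print(example_uri)
```

Whether a custom function with this argument list can be passed as uri_fun depends on how the package calls it, which this page does not state.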

Arguments

patient_vect

Vector of patient IDs. Default is NA, in which case all available patient records will undergo NLP annotation.

text_format

Text format (character encoding) for the NLP engine. Default is "latin1".

nlp_engine

Which NLP engine should be used? UDPipe is the only one supported for now.

uri_fun

Uniform resource identifier (URI) string generating function for MongoDB credentials.

user

MongoDB user name.

password

MongoDB user password.

host

MongoDB host server.

replica_set

MongoDB replica set, if indicated.

port

MongoDB port.

database

MongoDB database name.

max_n_grams_length

Maximum length, in tokens, of the n-grams matched against UMLS concept unique identifiers (CUIs). Smaller values will result in faster processing. If 0 is chosen, UMLS CUI tags will not be provided.

negex_depth

Maximum distance between a negation item and the token to negate. Shorter distances will result in decreased sensitivity but increased specificity for negation.

select_cores

How many CPU cores should be used for parallel processing? The maximum allowed is the total number of cores minus one. If 1 is entered, parallel processing will not be used.

URL

UDPipe model URL.
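
To make the effect of max_n_grams_length and negex_depth more concrete, here is a minimal sketch in base R. It is not the package's implementation. It assumes that n-gram length and negation distance are both counted in tokens and that the negation window extends in both directions from the cue. The helper names candidate_ngrams and within_negation_window are hypothetical.

```r
# Enumerate all contiguous n-grams of up to max_n tokens.
# Assumption: n-gram length is counted in tokens; 0 yields no candidates.
candidate_ngrams <- function(tokens, max_n) {
  if (max_n < 1 || length(tokens) == 0) return(character(0))
  out <- character(0)
  for (n in seq_len(min(max_n, length(tokens)))) {
    starts <- seq_len(length(tokens) - n + 1)
    grams <- vapply(starts,
                    function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
                    character(1))
    out <- c(out, grams)
  }
  out
}

# Flag tokens lying within `depth` positions of any negation cue.
# Assumption: distance is measured in tokens, in either direction;
# the cue itself is never flagged.
within_negation_window <- function(tokens, cues, depth) {
  cue_pos <- which(tolower(tokens) %in% cues)
  vapply(seq_along(tokens), function(i) {
    any(abs(i - cue_pos) <= depth & cue_pos != i)
  }, logical(1))
}

tokens <- c("patient", "denies", "chest", "pain", "or",
            "shortness", "of", "breath")
ngrams_2 <- candidate_ngrams(tokens, 2)
negated_2 <- within_negation_window(tokens, cues = "denies", depth = 2)
negated_6 <- within_negation_window(tokens, cues = "denies", depth = 6)
```

In this toy sentence, a depth of 2 leaves "shortness of breath" unflagged, while a depth of 6 flags every token after "denies". The number of candidate n-grams grows with max_n, which is why smaller values are faster.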

Value

Confirmation that the requested operation was completed, or an error message if the attempt failed.

Examples

## Not run: 
automatic_NLP_processor(patient_vect = NA, text_format = 'latin1', nlp_engine = 'udpipe',
  URL = 'models/english-ewt-ud-2.4-190531.udpipe', uri_fun = mongo_uri_standard, user = 'John',
  password = 'db_password_1234', host = 'server1234', port = NA, database = 'TEST_PROJECT',
  max_n_grams_length = 7, negex_depth = 6, select_cores = 1)

## End(Not run)
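
The limit on select_cores described under Arguments (total number of cores minus one) can be worked out before calling the function. The sketch below is one possible way to do this with the base parallel package. The helper pick_cores is hypothetical and not part of the package.

```r
# Hypothetical helper: clamp a requested core count to "total cores minus one".
# parallel::detectCores() may return NA on some platforms, hence the fallback to 1.
pick_cores <- function(requested, total = parallel::detectCores()) {
  if (is.na(total) || total < 2) return(1L)
  max_allowed <- total - 1L
  as.integer(min(max(requested, 1L), max_allowed))
}

# `total` is fixed here only to make the example reproducible
cores_to_use <- pick_cores(4, total = 8)
```

The resulting value could then be supplied as select_cores. A result of 1 means parallel processing will not be used.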

simon-hans/CEDARS documentation built on Feb. 14, 2024, 3:16 a.m.