ML4LHS/clinspacy: Clinical Natural Language Processing using 'spaCy', 'scispaCy', and 'medspaCy'

Performs biomedical named entity recognition, Unified Medical Language System (UMLS) concept mapping, and negation detection using the Python 'spaCy', 'scispaCy', and 'medspaCy' packages, and transforms extracted data into a wide format for inclusion in machine learning models. The development of the 'scispaCy' package is described by Neumann (2019) <doi:10.18653/v1/W19-5034>. The 'medspaCy' package uses 'ConText', an algorithm for determining the context of clinical statements described by Harkema (2009) <doi:10.1016/j.jbi.2009.05.002>. Clinspacy also supports entity embeddings from 'scispaCy' and UMLS 'cui2vec' concept embeddings developed by Beam (2018) <arXiv:1804.01486>.
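
To make the "wide format" step concrete, here is a minimal base R sketch of the idea. It does not call clinspacy: the data frame and its column names (note_id, entity, is_negated) are hypothetical stand-ins for the kind of long, one-row-per-entity output an entity extractor with negation detection might produce, and the package's own functions and column names may differ.

```r
# Hypothetical long-format extraction output: one row per entity mention.
# These column names are assumptions for illustration, not the package's API.
entities <- data.frame(
  note_id    = c(1, 1, 1, 2, 2),
  entity     = c("diabetes", "CKD", "HTN", "diabetes", "HTN"),
  is_negated = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only affirmed mentions, so that "no HTN" does not count as HTN.
affirmed <- entities[!entities$is_negated, ]

# Cross-tabulate notes against entities: one row per note, one count
# column per entity. This is the wide shape a model would consume.
wide <- as.data.frame.matrix(table(affirmed$note_id, affirmed$entity))
wide <- cbind(note_id = as.integer(rownames(wide)), wide)
rownames(wide) <- NULL

print(wide)
```

Each entity becomes a numeric column, so the result can be joined back to other per-note features before model fitting. Dropping negated mentions before counting is one possible design choice; keeping them as separate columns is another.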


Package details

Maintainer
License: MIT + file LICENSE
Version: 1.0.2.9000
URL: https://github.com/ML4LHS/clinspacy
Installation

Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("ML4LHS/clinspacy")
ML4LHS/clinspacy documentation built on Aug. 23, 2021, 8:47 p.m.