clinspacy: Clinical Natural Language Processing using 'spaCy', 'scispaCy', and 'medspaCy'

Performs biomedical named entity recognition, Unified Medical Language System (UMLS) concept mapping, and negation detection using the Python 'spaCy', 'scispaCy', and 'medspaCy' packages, and transforms the extracted data into a wide format for inclusion in machine learning models. The development of the 'scispaCy' package is described by Neumann (2019) <doi:10.18653/v1/W19-5034>. The 'medspaCy' package uses 'ConText', an algorithm for determining the context of clinical statements, described by Harkema (2009) <doi:10.1016/j.jbi.2009.05.002>. Clinspacy also supports entity embeddings from 'scispaCy' and UMLS 'cui2vec' concept embeddings developed by Beam (2018) <arXiv:1804.01486>.
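
As a rough illustration of the workflow described above, the sketch below runs entity extraction on a single made-up sentence. It assumes the package is installed and that a Python environment for 'spaCy', 'scispaCy', and 'medspaCy' can be set up on first use (this may trigger a sizeable download). The example sentence is invented for illustration, and the exact columns returned may differ between package versions.

```r
# Minimal sketch, assuming clinspacy and its Python dependencies are available.
library(clinspacy)

# Optional: initialize explicitly. By default this is assumed to set up
# the pipeline without the UMLS linker.
clinspacy_init()

# Hypothetical clinical sentence used only for illustration.
note <- "This patient has diabetes and CKD stage 3 but no HTN."

# Extract entities; the result is expected to be a data frame with one row
# per entity, including context flags such as negation.
entities <- clinspacy(note)
print(entities)
```

The negation and other context flags in the result come from the 'ConText' algorithm mentioned above, so an entity such as "HTN" in this sentence would be expected to be marked as negated.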

Package details

Author: Karandeep Singh [aut, cre], Benjamin Kompa [aut], Andrew Beam [aut], Allen Schmaltz [aut]
Maintainer: Karandeep Singh
License: MIT + file LICENSE
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
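
Assuming the CRAN release is the one wanted, the usual installer call would be:

```r
# Install the released version of clinspacy from CRAN.
install.packages("clinspacy")
```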


clinspacy documentation built on March 20, 2021, 5:06 p.m.