beamandrew/cui2vec: cui2vec Word Embeddings for Multimodal Data

We present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data. Leaning on recent theoretical insights, we demonstrate how an insurance claims database of 60 million members, a collection of 20 million clinical notes, and 1.7 million full-text biomedical journal articles can be combined to embed concepts into a common space, resulting in the largest-ever set of embeddings, covering 108,477 medical concepts. To evaluate our approach, we present a new benchmark methodology, based on statistical power, that is specifically designed to test embeddings of medical concepts.
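To make "a common space" concrete: once every concept is a numeric vector, relatedness between two concepts is commonly measured with cosine similarity. The sketch below is illustrative only. The three concept labels and their 4-dimensional vectors are invented for this example and are not taken from cui2vec, and `cosine_similarity` is a hypothetical helper, not a function from the package.

```r
# Toy embedding matrix: one row per concept.
# Labels and values are made up for illustration; they are NOT cui2vec data.
embeddings <- rbind(
  concept_a = c(0.9, 0.1, 0.0, 0.2),
  concept_b = c(0.8, 0.2, 0.1, 0.3),
  concept_c = c(0.0, 0.9, 0.7, 0.1)
)

# Hypothetical helper: cosine of the angle between two numeric vectors.
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Compare concept_a with each of the other two concepts.
sim_ab <- cosine_similarity(embeddings["concept_a", ], embeddings["concept_b", ])
sim_ac <- cosine_similarity(embeddings["concept_a", ], embeddings["concept_c", ])

print(c(a_vs_b = sim_ab, a_vs_c = sim_ac))
```

In this toy data, concept_a and concept_b point in nearly the same direction, so their similarity is higher than that of concept_a and concept_c.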

Package details

Maintainer
License: MIT + file LICENSE
Version: 0.0.0.9000
Package repository: GitHub (beamandrew/cui2vec)
Installation

Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("beamandrew/cui2vec")
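After running the commands above, it can be useful to confirm that the package is available before relying on it. The following is a minimal sketch that uses only base R (`requireNamespace`, `library`, `ls`). It makes no assumptions about which functions cui2vec exports and simply lists whatever is there.

```r
# Check whether cui2vec is installed without raising an error if it is not.
pkg <- "cui2vec"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # Attach the package and list the objects it exports.
  library(pkg, character.only = TRUE)
  print(ls(paste0("package:", pkg)))
} else {
  message("cui2vec is not installed; run the install commands above first.")
}
```

Using `requireNamespace` rather than calling `library` directly lets a script fail gracefully with a readable message when the GitHub install has not been done yet.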
beamandrew/cui2vec documentation built on Nov. 4, 2019, 7:07 a.m.