knitr::opts_chunk$set(results = 'asis', echo = FALSE, comment = FALSE,  warning = FALSE, message = FALSE, fig.align = 'center')
library(knitr)
library(here)
library(EndoMineR)
library(DiagrammeR)
library(CodeDepends)

Introduction

Patients with Barrett's oesophagus (BE) undergo regular endoscopic surveillance to detect oesophageal adenocarcinoma earlier. Quality monitoring of this programme relies on manual data extraction, which is laborious and therefore a significant barrier to robust, large-scale and reproducible monitoring.

EndoMineR, an open source package written in R, has been developed specifically to automate the extraction of data from endoscopic and associated pathology reports^1^. It contains functions to clean, format and extract elements from free text and to compute quality metrics for a range of conditions, including BE.

Aim

We aimed to assess the accuracy of the BE extraction algorithms within EndoMineR for both endoscopic and pathological elements of BE, using only pathology reports as input. This is the 'worst case scenario' input data. The functions assessed were: 1. extraction of the Prague score, 2. extraction of the worst pathology grade, 3. the site of biopsied tissue, 4. the site and type of any therapy in the upper GI tract.
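To make the first task concrete, the following is a minimal sketch of how a Prague score might be pulled out of free text with a regular expression in base R. It is not EndoMineR's own implementation: the helper name `extract_prague`, the pattern and the example sentences are all assumptions made for illustration.

```r
# Hypothetical helper, not part of EndoMineR: extract Prague C and M values
# (circumferential and maximal extent, in cm) from free-text reports.
extract_prague <- function(text) {
  # "C", a number, an optional comma, then "M" and a number; spaces are optional
  pattern <- "[Cc][[:space:]]*([0-9]+)[[:space:]]*,?[[:space:]]*[Mm][[:space:]]*([0-9]+)"
  hits <- regmatches(text, regexec(pattern, text))

  # Each hit is c(full match, C, M); reports without a match give NA
  pick <- function(h, i) if (length(h) == 3) as.integer(h[i]) else NA_integer_
  data.frame(
    C = vapply(hits, pick, integer(1), i = 2),
    M = vapply(hits, pick, integer(1), i = 3)
  )
}

# Invented example sentences, not study data
reports <- c(
  "Barrett's oesophagus C2M5, biopsies taken.",
  "Long segment Barrett's, Prague score C 1 M 3.",
  "Normal squamous mucosa."
)
prague <- extract_prague(reports)
print(prague)
```

A single pattern like this will miss many real-world phrasings, which is one reason accuracy has to be measured against manually labelled reports.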

Import the data

load(file=here("data", "TheOGDReportFinal.rda"))
load(file=here("data", "PathDataFrameFinal.rda"))
read_chunk(here("inst","TemplateProject","munge", "PreProcessing.R"))






Methods

As a training set, 160 patient episodes between x and ` with full-text pathology data only were acquired from x departments in central London. Validation was performed on a further 100 pathology reports. The therapy algorithm was run on a further 100 reports.


Show the Consort diagram


Show the CodeDepends


library(kableExtra)
knitr::kable(iris[1:5, 1:4], format = "html", caption = 'Table caption.', align = 'c') %>%
  kable_styling(full_width = TRUE)

Results

Reports were written by x different pathologists. The Flesch-Kincaid readability index of all the text was x, indicating an average grammatical complexity. The results are displayed in Table 1. Sensitivity was excellent for all algorithms, especially given the difficult input text. Specificity for the detection of worst pathology was reduced by dual reporting of colonoscopy and gastroscopy tissue, which also affected the sensitivity of the pathology site detection. Variability in how intestinalisation was reported also affected the specificity.
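For readers less familiar with the metrics, sensitivity and specificity can be computed by comparing each algorithm's output with manual labels. The sketch below uses base R only; the function name `diagnostic_accuracy` and the labels are invented for illustration and are not the study data or the analysis code used here.

```r
# Hypothetical helper: sensitivity and specificity from paired logical vectors,
# where `predicted` is the algorithm output and `truth` is the manual label.
diagnostic_accuracy <- function(predicted, truth) {
  stopifnot(length(predicted) == length(truth))
  tp <- sum(predicted & truth)    # true positives
  tn <- sum(!predicted & !truth)  # true negatives
  fp <- sum(predicted & !truth)   # false positives
  fn <- sum(!predicted & truth)   # false negatives
  c(sensitivity = tp / (tp + fn), specificity = tn / (tn + fp))
}

# Made-up labels for illustration only
truth     <- c(TRUE, TRUE, TRUE,  FALSE, FALSE, FALSE, TRUE, FALSE)
predicted <- c(TRUE, TRUE, FALSE, FALSE, TRUE,  FALSE, TRUE, FALSE)
acc <- diagnostic_accuracy(predicted, truth)
print(acc)
```

A false positive, such as colonic tissue counted as oesophageal pathology, lowers specificity, while a missed finding lowers sensitivity.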

Usually you want to have a nice table displaying some important results that you have calculated. In posterdown this is as easy as using the kable table formatting you are probably used to from typical RMarkdown. I suggest checking out the kableExtra package and its in-depth documentation on customizing these tables, found here [@kableExtra2019]. Hopefully I can make this with an inline reference like Table \@ref(tab:mytable).

Conclusion

1. Reproducible extraction of BE parameters can be done from semi-structured text.
2. Further improvements using parts of speech tagging and term mapping will improve the results.
3. Such data extraction will allow for upstream automation of quality monitoring and governance and novel metrics in BE as well as other gastroenterological conditions.

References

1. Zeki S (2018). EndoMineR for the extraction of endoscopic and associated pathology data from medical reports. Journal of Open Source Software, 3(24), 701.




sebastiz/PhysiMineR documentation built on Oct. 3, 2023, 3:46 p.m.