See ImpedanceQuestions/AcidReflux template

knitr::opts_chunk$set(results = 'asis', echo = FALSE, comment = FALSE,  warning = FALSE, message = FALSE, fig.align = 'center')
library(knitr)
library(here)
library(EndoMineR)
library(DiagrammeR)
library(CodeDepends)
library(readxl)
library(PhysiMineR)
library(dplyr)
library(stringr)

Introduction

Patients with suspected gastrooesophageal reflux disease (GORD) often undergo pH impedance and manometry if there is diagnostic doubt or as a work-up for anti-reflux procedures. pH impedance is usually performed as a 24 hour test and therefore may be negative in patients with intermittent acid reflux. In cases of clinical suspicion the patient may undergo a longer period of acid reflux monitoring which can be performed with an endoscopically inserted wireless capsule (BRAVO TM). It is unclear which patients with negative pH impedance should be selected for further testing and it may be that high resolution (HRM) parameters can give us insight as to who is at an increased risk having GORD even if undetected by pH impedance.The aim of the current study was to determine variables from either pH impedance or high resolution manometry, in cases where the pH impedance was negative, that may predict a pH impedance study that is positive for GORD.

Aim

read_chunk(here("inst","Projects","BRAVOStudies","HRMPredictorsOfAllPosBRAVO","munge", "PreProcessing.R"))

Import the data


Pre-process both the pathology and the endoscopy text:


Merge endoscopy and pathology text


Clean the endoscopist and instrument columns


Extract number of biopsies, biopsy size, biopsy location and medications from the text


Extract endoscopy events


Extract some generic analyses


Extract some disease specific analyses


Show the Consort diagram


Methods

All variables from HRM and pH impedance results of patients who had undergone studies at Guy's and St Thomas' Foundation NHS Trust was extracted Study Design and Patient Population The study population consisted of adult patients (18 years of age) with refractory heartburn or suspected EER symptoms (asthma, cough, or hoarseness) referred to the Esophageal Motility Center at the Vanderbilt University Medical Center (Nashville, TN) and had failed > 8 week course of high dose (twice-daily) PPI therapy. Excluded were patients with a history of fundoplication, radiation, or malignancy. The external validation cohort obtained from the Mayo Clinic (Rochester, MN) and the Washington University in St Louis (St Louis, MO) used these same inclusion and exclusion criteria. All patients underwent esophagogastroduodenoscopy (EGD) and 48-hour wireless pH monitoring by a single provider . Data were prospectively collected and included patient characteristics (age, sex, race, and body mass index [BMI]), presence of secondary GERD symptoms (heartburn with or without regurgitation) in patients with primary extraesophageal symptoms, and current acidsuppressive medication regimen (or no treatment). The presence and size of a hiatal hernia and the presence and severity of esophagitis (graded by the Los Angeles Classification as A, B, C, or D) were determined with endoscopy.

$\color{red}{\text{F.Define the Location}}$.

Wireless pH Monitoring Patients stopped taking PPIs and H2-receptor antagonists at least 7 days before undergoing 48-hour wireless ambulatory pH monitoring (Given Imaging, Duluth, GA). Patients underwent EGD for visual anatomic inspection (for presence and size of hiatal hernia and evaluation of esophageal mucosal integrity for esophagitis) and distance measurements from the incisors to the squamocolumnar junction. Wireless capsules were calibrated by submersion in buffer solutions at pH 7.0 and 1.0 and then activated by magnet removal. Capsules were placed 6 cm above the squamocolumnar junction and attached with vacuum suction using the manufacturer’s delivery system. Placement was confirmed with endoscopy. Subsequently, patients wore wireless pH recorders on their waists or kept them within 3–5 feet at all times. Recording devices received pH data sampling transmitted by the capsule at 433 Hz with 6-second sampling intervals during the 48-hour study. Patients were instructed to perform their normal daily activities and dietary practices.13 After study completion, data were downloaded to dedicated computers. Patient-recorded symptoms were entered into the computer-based record. Measurements of total, upright, and supine percentage of time when esophageal pH was below 4 were determined over days 1 and 2 of the wireless study. Total acid exposure time (percentage of total time with pH <4) greater than 5.5% (mild reflux) and 12% (moderate to severe reflux) were used as cutoffs based on the average of 2 days.14 A 12% cutoff was used to define moderate to severe reflux based on prior surgical data showing that 12% predicted positive fundoplication outcomes at 1 year in patients with EER symptoms.15 This is the only currently available study in the literature that has evaluated acid exposure time and surgical outcome in this population.

The results of high resolution studies performed in patients with cough as part of the presenting complaint to the gastroenterology service at Guy's and St Thomas' NHS Trust were examined retrospectively between r paste(min(lubridate::month(ImpAndHRM$VisitDate.x, label = TRUE, abbr = FALSE)),min(lubridate::year(ImpAndHRM$VisitDate.x))) and r paste(max(lubridate::month(ImpAndHRM$VisitDate.x, label = TRUE, abbr = FALSE)),max(lubridate::year(ImpAndHRM$VisitDate.x)))

$\color{red}{\text{F.Define the Eligibility}}$. All patients selected were adults over the age of 18. Ethics was approved (IRAS number) and by the local ethics board.

$\color{red}{\text{F.Define the Exclusion}}$. In cases where there was more than one pH impedance or HRM result, either test was matched with the other test performed most recently to prevent duplication or contradictory results. Only patients who had both HRM and pH impedance were included. A total of r nrow(unique(ImpAndHRM$HospNum_Id)) studies were retrieved (3562 individual patients)

$\color{red}{\text{G.Define the Sampling}}$.

Patients were selected according to Chicago Classification v3. Patients symptoms were selected from self reporting at impedance if this was done at the same time as high resolution manometry. If this was not the case then the symptoms were extracted either from the final report summarising the patient's condition and physiological diagnosis, or from the clinicians details when ordering the investigation. The impedance study results were classified as positive for acid reflux if the following thresholds were positive: Acid exposure total, recumbent or upright percent time exceeded 4.2%,1.2% or 6.3% respectively.

$\color{red}{\text{H.Define the data cleaning methodology methods}}$.

Show the CodeDepends


Quantitative data were presented as mean±SD. Data transformation by square root was applied to the operative time and blood loss during surgery, to meet the normality requirement. Operative time and blood loss were standardized by minusing their mean and then dividing by their SD.17 Student t test or χ2 test was applied to examine the difference of each variable as indicated. Linear regression was utilized to determine the relationships between the patients’ variables and endpoints (operative time, blood loss during surgery), whereas logistic regression was applied for postoperative morbidity. After univariate analysis, variables with a P<0.25 were selected for multivariate analysis. A multivariate analysis was performed using a multiple linear regression or logistic model, with a stepwise (forward selection/backward elimination) method. The statistical analysis was performed using SAS software (SAS Institute Inc., Cary, NC), and P<0.05 was considered to be significant.The strength of the corresponding associations was estimated by using univariate analysis and multivariate logistic regression analysis, and expressed as odds ratio (OR)

# Set up for the univariate analysis:
NegImpAndHRMNoMissingFinal<-ImpAndHRMNoMissingFinal[ImpAndHRMNoMissingFinal$AcidReflux==1,]
PosImpAndHRMNoMissingFinal<-ImpAndHRMNoMissingFinal[ImpAndHRMNoMissingFinal$AcidReflux==0,]

#Create the univariate table by selection from summary statistics using the psych package:

library(psych)
A<-describe(ImpAndHRMNoMissingFinal)
A$n<-paste0(round(A$mean,digits=2)," (sd: ",round(A$sd,digits=1),")")
colnames(A) <- paste("All_Pat",colnames(A), sep = "_")

P<-describe(PosImpAndHRMNoMissingFinal)
P$n<-paste0(round(P$mean,digits=2)," (sd: ",round(P$sd,digits=1),")")
colnames(P) <- paste("Pos_Pat",colnames(P), sep = "_")

N<-describe(NegImpAndHRMNoMissingFinal)
N$n<-paste0(round(N$mean,digits=2)," (sd: ",round(N$sd,digits=1),")")
colnames(N) <- paste("Neg_Pat",colnames(N), sep = "_")

ss<-cbind(A,P,N)
ss<-ss%>%select(All_Pat_n,Pos_Pat_n,Neg_Pat_n)


#Add the n values to show the total numbers of patients in each group:
df<-data.frame(nrow(ImpAndHRMNoMissingFinal),nrow(PosImpAndHRMNoMissingFinal),nrow(NegImpAndHRMNoMissingFinal))
names(df)<-names(ss)
ss <- rbind(df, ss)

#Add each of the p-values for each comparison
IntraabdominalLESlengthcmttest<-t.test(NegImpAndHRMNoMissingFinal$IntraabdominalLESlengthcm,PosImpAndHRMNoMissingFinal$IntraabdominalLESlengthcm,paired=FALSE)
Hiatalherniattest<-t.test(NegImpAndHRMNoMissingFinal$Hiatalhernia,PosImpAndHRMNoMissingFinal$Hiatalhernia,paired=FALSE)
LOS_relaxttest<-chisq.test(ImpAndHRMNoMissingFinal$LOS_relax, ImpAndHRMNoMissingFinal$AcidReflux)
BasalrespiratoryminmmHgttest<-t.test(NegImpAndHRMNoMissingFinal$BasalrespiratoryminmmHg,PosImpAndHRMNoMissingFinal$BasalrespiratoryminmmHg,paired=FALSE)
ResidualmeanmmHgttest<-t.test(NegImpAndHRMNoMissingFinal$ResidualmeanmmHg,PosImpAndHRMNoMissingFinal$ResidualmeanmmHg,paired=FALSE)
DistalcontractileintegralmeanmmHgcmsttest<-t.test(NegImpAndHRMNoMissingFinal$DistalcontractileintegralmeanmmHgcms,PosImpAndHRMNoMissingFinal$DistalcontractileintegralmeanmmHgcms,paired=FALSE)
IntraboluspressureATLESRmmHgttest<-t.test(NegImpAndHRMNoMissingFinal$IntraboluspressureATLESRmmHg,PosImpAndHRMNoMissingFinal$IntraboluspressureATLESRmmHg,paired=FALSE)
Distallatencyttest<-t.test(NegImpAndHRMNoMissingFinal$Distallatency,PosImpAndHRMNoMissingFinal$Distallatency,paired=FALSE)
failedChicagoClassificationttest<-t.test(NegImpAndHRMNoMissingFinal$failedChicagoClassification,PosImpAndHRMNoMissingFinal$failedChicagoClassification,paired=FALSE)
panesophagealpressurizationttest<-t.test(NegImpAndHRMNoMissingFinal$panesophagealpressurization,PosImpAndHRMNoMissingFinal$panesophagealpressurization,paired=FALSE)
largebreaksttest<-t.test(NegImpAndHRMNoMissingFinal$largebreaks,PosImpAndHRMNoMissingFinal$largebreaks,paired=FALSE)
prematurecontractionttest<-t.test(NegImpAndHRMNoMissingFinal$prematurecontraction,PosImpAndHRMNoMissingFinal$prematurecontraction,paired=FALSE)
rapidcontractionttest<-t.test(NegImpAndHRMNoMissingFinal$rapidcontraction,PosImpAndHRMNoMissingFinal$rapidcontraction,paired=FALSE)
smallbreaksttest<-t.test(NegImpAndHRMNoMissingFinal$smallbreaks, PosImpAndHRMNoMissingFinal$smallbreaks,paired=FALSE)

df<-c("-",format.pval(IntraabdominalLESlengthcmttest$p.value),format.pval(Hiatalherniattest$p.value),
      format.pval(LOS_relaxttest$p.value),
      format.pval(BasalrespiratoryminmmHgttest$p.value),
      format.pval(ResidualmeanmmHgttest$p.value),
      format.pval(DistalcontractileintegralmeanmmHgcmsttest$p.value),
      format.pval(IntraboluspressureATLESRmmHgttest$p.value),
      format.pval(Distallatencyttest$p.value),
      format.pval(failedChicagoClassificationttest$p.value),
      format.pval(panesophagealpressurizationttest$p.value),
      format.pval(largebreaksttest$p.value),
      format.pval(prematurecontractionttest$p.value),
      format.pval(rapidcontractionttest$p.value),
      format.pval(smallbreaksttest$p.value),"N/A")

#Add the column to the table:

ss<-cbind(ss,df)

#Rename the columns

names(ss)<-c("All","GORD Positive","Gord Negative","p value")

#Rename the rows:
rownames(ss)<-c("n","Intraabdominal LES length (cm)","Hiatal hernia","LOS relax","Basal respiratory min (mmHg)","Residual mean (mmHg)","Distal contractile integralmean (mmHg/cm/s)","Intrabolus pressure @ LESR(mmHg)","Distal latency","failedPeristalsis(%)","Panesophageal pressurization(%)","Large breaks(%)","Premature contraction(%)","Rapid contraction(%)","small breaks(%)","Acid Reflux")

knitr::kable(ss)

Stepwise multiple logistic regression analysis was used to identify significant predictor variables of a major final diagnosis. The prediction model was built using SAS stepwise logistic regression analysis on the exploratory sample population with an entry criterion of p<0.3. The stepwise procedure added the independent variables to the model one at a time. In the final model variables were removed if the retention criterion of p⩽0.05 was not met. The study population was randomly divided into an exploration group (n=1908) and a validation group (n=1907). Thirteen predictor variables (age, sex, ASA classification of comorbidity, bleeding, vomiting, heartburn, dysphagia, weight loss, early satiety, chest pain, odynophagia, anaemia, and feeding problems) were included in the model building process. Interaction among the predictor variables was analysed using the Breslow day test for homogeneity.20 Once the model was established using the exploratory group, the parameter estimates were applied to the validation group to test the predictive accuracy of the model. The predictive value of the model was assessed with a receiver operating characteristic (ROC) curve. The curve represents the relationship between sensitivity and specificity for the prediction of a major final diagnosis.

Because prediction rules based on logistic regression are often too complex for routine clinical use, we also developed a simplified clinical prediction rule. The variables specified in the logistic regression model were evaluated using a univariate logistic analysis to justify inclusion in the simplified model. The simplified rule corresponds to current consensus recommendations on endoscopic evaluation of dyspepsia. All patients aged ⩾45 years or with any “alarm” symptoms (those identified as risk factors in the multivariate model) were considered to have a “positive” indication. The sensitivity and specificity of the simplified rule was determined by comparison with the actual endoscopic findings among those with and without a “positive” indication. We also varied the age cut off from 30 to 70 years and determined the sensitivity and specificity at each cut off.

After assessment of colinearity and exclusion of significantly correlated variables, the following variables were chosen for further analysis from the HRM report: Intraabdominal LES length,presence of a Hiatal hernia, Basal respiratory minimum pressure, Residual mean pressure, Distal contractile integral, Intrabolus pressure, Distal latency, failed peristalsis percentage, panesophageal pressurization percentage, large breaks percentage, premature contraction percentage, rapid contraction percentage and small breaks percentage. The dependent variable was the presence of acid reflux on impedance.

Statistical Analysis There was strict control and supervision of the data entry and access for this study and all data were collected and stored at the secure web-based Vanderbilt Digestive Disease Center REDCap (Research Electronic Data Capture; 1 UL1 RR024975 NCRR/NIH). Categorical variables were summarized using percentages and continuous variables were summarized using median and interquartile range (25th–75th percentile). Wilcoxon rank sum and Pearson c2 tests were used to compare continuous and categorical variables between groups, respectively. Separate multivariable proportional odds ordinal logistic regression models were developed for heartburn or EER symptoms to determine factors associated with abnormal pH. The 2 models were constructed by (1) pre-specifying a set of easily obtained demographic and clinical predictors, (2) fitting a multivariable model with these predictors, and (3) using a single backward selection step to remove insignificant predictors (P < .05) to simplify the model. For the heartburn model, the predictors age, BMI, hiatal hernia, and regurgitation were initially inputted. The EER model mirrored the heartburn model except for the addition of cough or hoarseness and asthma. Age and regurgitation were not significant in either model and were excluded. The model was based on continuous variables and then simplified to a scoring system that could be used at bedside. BMI was modeled flexibly as a continuous variable in the initial predictive model. We subsequently used a cutoff of BMI >25 kg/m2 in the final model because this corresponded to a similar percentage of increase in the probability of abnormal pH as the other binary predictors in the scoring system (heartburn and asthma). All analyses were conducted using the R statistical program using principles for reproducible research.16 We subsequently validated the performance of the prediction model using an external dataset from 2 tertiary care medical centers (Mayo Clinic and Washington University in St Louis)

library(kableExtra)
library(MASS)
library(epiDisplay)
# Fit the full model 
data<-na.omit(ImpAndHRMNoMissingFinal)
full.model <- glm(AcidReflux ~ ., data = data)
# Stepwise regression model
step.model <- stepAIC(full.model, direction = "both", 
                      trace = FALSE)
modelSum<-summary(step.model)
modelSum%>%xtable%>%kable

kable(logistic.display(step.model))%>%
   kable_styling(bootstrap_options = "striped", full_width = T, position = "float_right")
detach("package:epiDisplay", unload=TRUE)

detach("package:MASS", unload=TRUE)

library(stargazer)
library(Gmisc)
Gmisc::htmltable(stargazer(step.model,type="text"))
xtable(stargazer(step.model,type="text"))



library()
library(jtools)
library(pROC)
export_summs(step.model,scale = TRUE,error_format = "[{conf.low}, {conf.high}]")
plot_summs(step.model, scale = TRUE, plot.distributions = TRUE)
predpr <- predict(full.model,type=c("response"))
roccurve <- roc(AcidReflux ~ predpr, data = data)
plot(roccurve)
auc(roccurve)

A generalised linear multiple regression model was implemented in R (MASS package) using a stepwise bidirectional linear regression method. The model determined that the most important variables for the prediction of a positive pH impedance test were the following (Table 1). The area under the curve for the model was r auc(roccurve)

#### Documenting the sets 

# nodes <- create_node_df(n=6, 
#                         nodes=c("ImpAll", "AllBravo", "NegativeImpedance","MyBravoFromNegativeImpedance","MyBravoFromNegativeImpedancePOSITIVE_BRAVO","MyBravoFromNegativeImpedanceNEGATTIVE_BRAVO"),
#                         label=c(paste0("ImpAll: ",nrow(ImpAll)), stringr::str_wrap(paste0("All BRAVO: ",nrow(AllBravo)),10), stringr::str_wrap(paste0("Negative Impedance: ",nrow(NegativeImpedance)),10),
#                                 stringr::str_wrap(paste0("MyBravo (Negative Impedance):",nrow(MyBravoFromNegativeImpedance)),10),
#                                 stringr::str_wrap(paste0("Positive BRAVO:",nrow(MyBravoFromNegativeImpedancePOSITIVE_BRAVO)),10),
#                                 stringr::str_wrap(paste0("Negative BRAVO: ",nrow(MyBravoFromNegativeImpedanceNEGATTIVE_BRAVO)),10)),
#                         shape = "rectangle",
#                         fontsize=5)
# 
# edges <-
#   create_edge_df(
#     from = c(1,2,3,4,4),
#     to = c(3,4,4,5,6))
# 
# 
# g <- create_graph(nodes_df=nodes, 
#                   edges_df=edges)%>%
#    add_global_graph_attrs(
#       attr = c("layout", "rankdir", "splines"),
#       value = c("dot", "TB", "false"),
#       attr_type = c("graph", "graph", "graph"))
# render_graph(g)
####Documenting the code: 

# library(CodeDepends)
# sc = readScript("/home/rstudio/PhysiMineR/Questions/ImpedanceQues/NegImpBRAVO/NegativeImpedanceBRAVO.Rmd")
# g = makeVariableGraph( info =getInputs(sc))
# if(require(Rgraphviz))
#   edgemode(g) <- "directed"
# x <- layoutGraph(g, layoutType="neato")
# zz = layoutGraph(g)
# graph.par(list(nodes = list(fontsize = 100)))
# renderGraph(zz)

Results

$\color{red}{\text{I.Define the Statistical methods}}$.

eg Clinical isolates of MRSP cases (n = 150) and methicillin-susceptible S. pseudintermedius (MSSP) controls (n = 133) and their corresponding host signalment and medical data covering the six months prior to staphylococcal isolation were analysed by multivariable logistic regression.

The identity of all MRSP isolates was confirmed through demonstration of S. intermedius-group specific nuc and mecA.

$\color{red}{\text{I.Describe demographics}}$. The mean age was 57.3 ± 17 years, and 62.3% of the subjects were male. The age-adjusted incidence rates were 13.8 (non-lobar) and 4.9 (lobar) per 100,000 person-years. $\color{red}{\text{J.Then describe level one results}}$.

$\color{red}{\text{K.Then describe subset results}}$.

eg In the final model, cats (compared to dogs, OR 18.5, 95% CI 1.8–188.0, P = 0.01), animals that had been hospitalised (OR 104.4, 95% CI 21.3–511.6, P < 0.001), or visited veterinary clinics more frequently (>10 visits OR 7.3, 95% CI 1.0–52.6, P = 0.049) and those that had received topical ear medication (OR 5.1, 95% CI 1.8– 14.9, P = 0.003) or glucocorticoids (OR 22.5, 95% CI 7.0–72.6, P < 0.001) were at higher risk of MRSP infection, whereas S. pseudintermedius isolates from ears were more likely to belong to the MSSP group (OR 0.09, 95% CI 0.03–0.34, P < 0.001).

Choose your statistical analysis

Discussion

$\color{red}{\text{L.Sentence 1: To be decided}}$.

$\color{red}{\text{M.Sentence 2: To be decided}}$.

$\color{red}{\text{N.Sentence 3: To be decided}}$. eg These results indicate an association of MRSP infection with veterinary clinic/hospital settings and possibly with chronic skin disease.

Limitations

$\color{red}{\text{O.Sentence 1: To be decided}}$. eg There was an unexpected lack of association between MRSP and antimicrobial therapy; this requires further investigation .(Lehner et al., 2014).



sebastiz/PhysiMineR documentation built on Oct. 3, 2023, 3:46 p.m.