glow500: GLOW500 data

Description Usage Format Source Examples

Description

glow500 dataset.

Usage

1

Format

A data.frame with 500 rows and 15 variables:

sub_id

Identification Code (1 - n)

site_id

Study Site (1 - 6)

phy_id

Physician ID code (128 unique codes)

priorfrac

History of Prior Fracture (1: No, 2: Yes)

age

Age at Enrollment (Years)

weight

Weight at enrollment (Kilograms)

height

Height at enrollment (Centimeters)

bmi

Body Mass Index (Kg/m^2)

premeno

Menopause before age 45 (1: No, 2: Yes)

momfrac

Mother had hip fracture (1: No, 2: Yes)

armassist

Arms are needed to stand from a chair (1: No, 2: Yes)

smoke

Former or current smoker (1: No, 2: Yes)

raterisk

Self-reported risk of fracture (1: Less than others of the same age, 2: Same as others of the same age, 3: Greater than others of the same age)

fracscore

Fracture Risk Score (Composite Risk Score)

fracture

Any fracture in first year (1: No, 2: Yes)

Source

Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X. (2013) Applied Logistic Regression, 3rd ed., New York: Wiley

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
head(glow500, n = 10)
summary(glow500)

## Table 2.2 p. 39
summary(mod2.2 <- glm(fracture ~ age + weight + priorfrac +
                                 premeno + raterisk,
                      family = binomial,
                      data = glow500))

## Table 2.3 p. 40
summary(mod2.3 <- update(mod2.2, . ~ . - weight - premeno))

## Table 2.4 p. 44
vcov(mod2.3)

## Table 3.6 p. 58
contrasts(glow500$raterisk)

## Contrasts: Table 3.8 and 3.9 p. 60
contrasts(glow500$raterisk) <- matrix(c(-1,-1,1,0,0,1), byrow= TRUE, ncol = 2)
summary(mod3.9 <- glm(fracture ~ raterisk, family = binomial,
                      data = glow500))
# cleaning modified dataset ...
rm(glow500)

## Table 5.1 pg 160 - Hosmer-Lemeshow test (with vcdExtra package)
mod4.16 <- glm(fracture ~ age * priorfrac + height + momfrac * armassist +
                          I(as.integer(raterisk) == 3) ,
               family = binomial,
               data = glow500)
library(vcdExtra)
summary(HLtest(mod4.16))

## Table 5.3 p. 171 - Classification table
glow500$pred4.16 <- predict(mod4.16, type = "response")
with(glow500, addmargins(table( pred4.16 > 0.5, fracture)))

## Sensitivy, specificity, ROC (using pROC)
library(pROC)

## Figure 5.3 p. 177 - ROC curve (using pROC package)
print(roc4.16 <- roc(fracture ~ pred4.16, data = glow500))
plot(roc4.16, main = "Figure 5.3 p. 177")

## Table 5.8 p. 175
vars <- c("thresholds","sensitivities","specificities")
tab5.8 <- data.frame(roc4.16[vars])
## Now, for printing/comparison purposes, steps below in order to find
## threshold values most similar to those in the table
findIndex <- function(x, y) which.min( (x-y)^2 )
cutPoints <- seq(0.05, 0.75, by = 0.05)
tableIndex <- mapply(findIndex, y = cutPoints,
                     MoreArgs = list(x = roc4.16$thresholds))
## And finally, let's print a reasonable approximation of table 5.8
writeLines("\nTable 5.8 p. 175\n")
tab5.8[tableIndex, ]

## Figure 5.1 p. 175
plot(specificities ~ thresholds, xlim = c(0, 1), type = "l",
     xlab = "Probabilty cutoff", ylab = "Sensitivity/specificity",
     ylim = c(0, 1), data = tab5.8, main = "Figure 5.1 p. 175")
with(tab5.8, lines(thresholds, sensitivities, col = "red"))
legend(x = 0.75, y = 0.55, legend = c("Sensitivity", "Specificity"),
       lty = 1, col = c("red","black"))
abline(h = c(0, 1), col = "grey80", lty = "dotted")

Example output

   sub_id site_id phy_id priorfrac age weight height      bmi premeno momfrac
1       1       1     14        No  62   70.3    158 28.16055      No      No
2       2       4    284        No  65   87.1    160 34.02344      No      No
3       3       6    305       Yes  88   50.8    157 20.60936      No     Yes
4       4       6    309        No  82   62.1    160 24.25781      No      No
5       5       1     37        No  61   68.0    152 29.43213      No      No
6       6       5    299       Yes  67   68.0    161 26.23356      No      No
7       7       5    302        No  84   50.8    150 22.57778      No      No
8       8       1     36       Yes  82   40.8    153 17.42919      No      No
9       9       1      8       Yes  86   62.6    156 25.72321      No      No
10     10       4    282        No  58   63.5    166 23.04398      No      No
   armassist smoke raterisk fracscore fracture
1         No    No     Same         1       No
2         No    No     Same         2       No
3        Yes    No     Less        11       No
4         No    No     Less         5       No
5         No    No     Same         1       No
6         No   Yes     Same         4       No
7         No    No     Less         6       No
8         No    No     Same         7       No
9         No    No     Same         7       No
10        No    No     Less         0       No
     sub_id         site_id          phy_id       priorfrac      age       
 Min.   :  1.0   Min.   :1.000   Min.   :  1.00   No :374   Min.   :55.00  
 1st Qu.:125.8   1st Qu.:2.000   1st Qu.: 57.75   Yes:126   1st Qu.:61.00  
 Median :250.5   Median :3.000   Median :182.50             Median :67.00  
 Mean   :250.5   Mean   :3.436   Mean   :178.55             Mean   :68.56  
 3rd Qu.:375.2   3rd Qu.:5.000   3rd Qu.:298.00             3rd Qu.:76.00  
 Max.   :500.0   Max.   :6.000   Max.   :325.00             Max.   :90.00  
     weight           height           bmi        premeno   momfrac   armassist
 Min.   : 39.90   Min.   :134.0   Min.   :14.88   No :403   No :435   No :312  
 1st Qu.: 59.90   1st Qu.:157.0   1st Qu.:23.27   Yes: 97   Yes: 65   Yes:188  
 Median : 68.00   Median :161.5   Median :26.42                                
 Mean   : 71.82   Mean   :161.4   Mean   :27.55                                
 3rd Qu.: 81.30   3rd Qu.:165.0   3rd Qu.:30.79                                
 Max.   :127.00   Max.   :199.0   Max.   :49.08                                
 smoke        raterisk     fracscore      fracture 
 No :465   Less   :167   Min.   : 0.000   No :375  
 Yes: 35   Same   :186   1st Qu.: 2.000   Yes:125  
           Greater:147   Median : 3.000            
                         Mean   : 3.698            
                         3rd Qu.: 5.000            
                         Max.   :11.000            

Call:
glm(formula = fracture ~ age + weight + priorfrac + premeno + 
    raterisk, family = binomial, data = glow500)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.42250  -0.74873  -0.59711  -0.06046   2.28830  

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -5.605794   1.220668  -4.592 4.38e-06 ***
age              0.050142   0.013417   3.737 0.000186 ***
weight           0.004080   0.006926   0.589 0.555803    
priorfracYes     0.679457   0.242384   2.803 0.005059 ** 
premenoYes       0.186958   0.276710   0.676 0.499266    
rateriskSame     0.534493   0.275858   1.938 0.052676 .  
rateriskGreater  0.874123   0.289157   3.023 0.002503 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 562.34  on 499  degrees of freedom
Residual deviance: 518.08  on 493  degrees of freedom
AIC: 532.08

Number of Fisher Scoring iterations: 4


Call:
glm(formula = fracture ~ age + priorfrac + raterisk, family = binomial, 
    data = glow500)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.42693  -0.73132  -0.60772  -0.07484   2.23819  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -4.99060    0.90270  -5.529 3.23e-08 ***
age              0.04591    0.01244   3.690 0.000224 ***
priorfracYes     0.70023    0.24116   2.904 0.003689 ** 
rateriskSame     0.54856    0.27501   1.995 0.046075 *  
rateriskGreater  0.86576    0.28621   3.025 0.002487 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 562.34  on 499  degrees of freedom
Residual deviance: 518.90  on 495  degrees of freedom
AIC: 528.9

Number of Fisher Scoring iterations: 4

                (Intercept)           age  priorfracYes  rateriskSame
(Intercept)      0.81486728 -0.0108886686  0.0445010519 -0.0603876325
age             -0.01088867  0.0001547891 -0.0008344776  0.0002239259
priorfracYes     0.04450105 -0.0008344776  0.0581588491 -0.0031271822
rateriskSame    -0.06038763  0.0002239259 -0.0031271822  0.0756289050
rateriskGreater -0.08055123  0.0005370070 -0.0118441110  0.0462406814
                rateriskGreater
(Intercept)        -0.080551235
age                 0.000537007
priorfracYes       -0.011844111
rateriskSame        0.046240681
rateriskGreater     0.081913370
        Same Greater
Less       0       0
Same       1       0
Greater    0       1

Call:
glm(formula = fracture ~ raterisk, family = binomial, data = glow500)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.9005  -0.7726  -0.6058  -0.0838   1.8899  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -1.1172     0.1062 -10.514  < 2e-16 ***
raterisk1     0.0611     0.1437   0.425  0.67067    
raterisk2     0.4240     0.1466   2.892  0.00383 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 562.34  on 499  degrees of freedom
Residual deviance: 550.58  on 497  degrees of freedom
AIC: 556.58

Number of Fisher Scoring iterations: 4

Loading required package: vcd
Loading required package: MASS
Loading required package: grid
Loading required package: colorspace
Loading required package: gnm
Partition for Hosmer and Lemeshow Goodness-of-Fit Test 

               cut total obs      exp         chi
1  [0.0209,0.0847]    50  47 46.68709  0.04579536
2   (0.0847,0.111]    50  46 45.13980  0.12803215
3    (0.111,0.141]    50  43 43.72511 -0.10965679
4    (0.141,0.176]    51  40 42.92385 -0.44627785
5    (0.176,0.208]    49  42 39.60399  0.38073192
6    (0.208,0.249]    50  37 38.60188 -0.25782525
7    (0.249,0.322]    50  41 35.73499  0.88075020
8    (0.322,0.388]    50  31 32.37786 -0.24214783
9    (0.388,0.482]    50  25 28.18856 -0.60056317
10   (0.482,0.747]    50  23 22.01688  0.20952147
Hosmer and Lemeshow Goodness-of-Fit Test 

Call:
glm(formula = fracture ~ age * priorfrac + height + momfrac * 
    armassist + I(as.integer(raterisk) == 3), family = binomial, 
    data = glow500)
 ChiSquare df   P_value
  6.391925  8 0.6034186
       fracture
         No Yes Sum
  FALSE 356 103 459
  TRUE   19  22  41
  Sum   375 125 500
Type 'citation("pROC")' for a citation.

Attaching package: 'pROC'

The following object is masked from 'package:colorspace':

    coords

The following objects are masked from 'package:stats':

    cov, smooth, var


Call:
roc.formula(formula = fracture ~ pred4.16, data = glow500)

Data: pred4.16 in 375 controls (fracture No) < 125 cases (fracture Yes).
Area under the curve: 0.7286

Table 5.8 p. 175

    thresholds sensitivities specificities
6   0.05006609         1.000     0.0160000
64  0.09986050         0.952     0.1973333
136 0.15020771         0.848     0.3840000
201 0.20022684         0.768     0.5520000
263 0.24951185         0.640     0.6826667
294 0.30000935         0.624     0.7653333
335 0.34960972         0.488     0.8293333
366 0.40031190         0.392     0.8800000
393 0.44967455         0.296     0.9226667
417 0.50009629         0.176     0.9493333
434 0.54891768         0.096     0.9680000
447 0.59548830         0.032     0.9813333
453 0.65647052         0.024     0.9946667
456 0.70450302         0.000     0.9946667
457 0.74119791         0.000     0.9973333

aplore3 documentation built on May 2, 2019, 8:24 a.m.