hdfail: Hard drive failure dataset


Description

This dataset contains the observed follow-up times and SMART statistics of 52k unique hard drives.

The data come from daily snapshots that a large backup storage provider made publicly available, covering approximately 2 years. On each day, the Self-Monitoring, Analysis, and Reporting Technology (SMART) statistics of operational drives are recorded. When a hard drive is no longer operational, it is marked as a failure and removed from subsequent daily snapshots. New hard drives are continuously added to the population. In total, the snapshots cover over 52k unique hard drives, of which 2885 (5.5%) failed.
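The following is a minimal sketch of how per-drive follow-up times and failure indicators could be derived from daily snapshots of this kind. The toy data frame and its column names (serial, day, failure) are hypothetical and chosen for illustration; this is not necessarily how hdfail itself was constructed.

```r
# Toy daily snapshots (hypothetical data, not taken from hdfail).
# Each row is one drive observed on one day; failure = 1 on the day it fails.
snapshots <- data.frame(
  serial  = c("A", "A", "A", "B", "B", "C", "C", "C"),
  day     = c(1, 2, 3, 2, 3, 1, 2, 3),
  failure = c(0, 0, 1, 0, 0, 0, 0, 0)
)

# Follow-up time: number of daily snapshots in which each drive appears.
time <- tapply(snapshots$day, snapshots$serial, length)

# Status: 1 if the drive was ever marked as failed, 0 otherwise (censored).
status <- tapply(snapshots$failure, snapshots$serial, max)

# One row per drive, in the same shape as the time and status columns.
followup <- data.frame(
  serial = names(time),
  time   = as.vector(time),
  status = as.vector(status)
)
followup
```

Drive B enters the population on day 2, so its follow-up time is shorter than that of drive C even though neither fails during the observation window.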

Usage

data("hdfail")

Format

A data frame with 52422 observations on the following 8 variables.

serial

unique serial number of the hard drive

model

hard drive model

time

the observed follow-up time

status

failure indicator

temp

temperature in Celsius

rsc

binary covariate, where 1 indicates sectors that encountered read, write, or verification errors

rer

binary covariate, where 1 indicates a non-zero rate of hardware errors that occur when reading data from disk.

psc

binary covariate, where 1 indicates there were sectors waiting to be remapped due to an unrecoverable error.
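As a quick sanity check, the counts stated on this page can be compared with the data itself. The sketch below recomputes the failure percentage from the stated counts and, only if the frailtySurv package is installed, prints the dimensions and a tabulation of status. It assumes that status is coded so that failures and non-failures appear as separate levels.

```r
# Counts stated in this page.
n_drives   <- 52422
n_failures <- 2885

# Percentage of drives that failed, rounded to one decimal place.
pct_failed <- round(100 * n_failures / n_drives, 1)
pct_failed

# If frailtySurv is installed, inspect the data directly.
if (requireNamespace("frailtySurv", quietly = TRUE)) {
  data("hdfail", package = "frailtySurv")
  print(dim(hdfail))            # rows and columns of the data frame
  print(table(hdfail$status))   # counts for each value of the failure indicator
}
```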

Source

https://www.backblaze.com/hard-drive-test-data.html

Examples

## Not run: 
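library(survival)     # provides Surv()
library(frailtySurv)  # provides fitfrail() and the hdfail data
data(hdfail)

# Select only Western Digital hard drives
dat <- subset(hdfail, grepl("WDC", model))

# Gamma frailty model with drives clustered by model
fit.hd <- fitfrail(Surv(time, status) ~ temp + rer + rsc + psc + cluster(model),
                   dat, frailty = "gamma", fitmethod = "score")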

fit.hd

## End(Not run)

frailtySurv documentation built on Aug. 29, 2018, 1:04 a.m.