Standardizing Medical Devices Surveillance Data"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
load(file="../data/maude.rda")
load(file="../data/sales.rda")
library(mds)

Why?

Medical device event data are messy.

Common challenges include:

How?

The mds package provides a standardized framework to address these challenges:

Purpose of This Vignette

Note on Statistical Algorithms

mds data and analysis standards allow for seamless application of various statistical trending algorithms via the mdsstat package (under development).

Data: MAUDE and Simulated Sales

Our example dataset maude was queried from the FDA MAUDE API and contains 535 reported events on bone cement in 2017. Furthermore, a simulated exposure dataset sales was generated to provide denominator data for our bone cement events.

library(mds)
dim(maude)
dim(sales)
head(maude, 3)
knitr::kable(head(maude, 3))
head(sales, 3)
knitr::kable(head(sales, 3))

Raw Data to Trending in 4 Steps {#foursteps}

The general workflow to go from data to trending over time is as follows:

  1. Use deviceevent() to standardize device-event data.
  2. Use exposure() to standardize exposure data (optional).
  3. Use define_analyses() to enumerate possible analysis combinations.
  4. Use time_series() to generate counts (and/or rates) by time based on your defined analyses.

Live Example

# Step 1 - Device Events
de <- deviceevent(
  maude,
  time="date_received",
  device_hierarchy=c("device_name", "device_class"),
  event_hierarchy=c("event_type", "medical_specialty_description"),
  key="report_number",
  covariates="region",
  descriptors="_all_")

# Step 2 - Exposures (Optional step)
ex <- exposure(
  sales,
  time="sales_month",
  device_hierarchy="device_name",
  match_levels="region",
  count="sales_volume")

# Step 3 - Define Analyses
da <- define_analyses(
  de,
  device_level="device_name",
  exposure=ex,
  covariates="region")

# Step 4 - Time Series
ts <- time_series(
  da,
  deviceevents=de,
  exposure=ex)

What to Do Next

You may:

Summarize Defined Analyses

summary(da)

Show All Analyses as a Data Frame

dadf <- define_analyses_dataframe(da)
head(dadf, 3)
knitr::kable(head(dadf, 3))

Plot Time Series of Counts and Rates {#quickplot}

plot(ts[[1]])
plot(ts[[4]], "rate", type='l')

deviceevent() to Standardize Device-Event Data {#de}

Basic Usage

de <- deviceevent(maude, "date_received", c("device_name", "device_class"), c("event_type", "medical_specialty_description"))
head(de, 3)
knitr::kable(head(de, 3))

Advanced Usage

de <- deviceevent(
  maude,
  time="date_received",
  device_hierarchy=c("device_name", "device_class"),
  event_hierarchy=c("event_type", "medical_specialty_description"),
  key="report_number",
  covariates="region",
  descriptors="_all_")
head(de, 3)
knitr::kable(head(de, 3))

Required Arguments

data_frame : is the input device-event data frame. All remaining arguments refer to variables within this data frame.

time : is the name of the variable containing the time of the event. The input format of the time will be flexibly converted into Date format.

device_hierarchy : organizes all your device variables into a hierarchy. The hierarchical concept reflects how devices are often nested into progressively more general groups. Set the first variable as the lowest device level that you would like to trend at. mds remembers this hierarchy and allows trending at multiple levels as you specify.

event_hierarchy : organizes all your event variables into a hierarchy. Like devices, event variables should be categorical in nature. Free text descriptions should not be listed here, but rather in the descriptors argument. The hierarchical concept reflects how events are often nested into progressively more general groups. Set the first variable as the lowest event level that you would like to trend at. mds remembers this hierarchy and allows trending at multiple levels as you specify. If your data does not have an event variable, you will need to create a dummy variable.

Optional Arguments

key : is a unique identifier for each unique event in data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to be able to "look up" individual constituent events.

covariates : are a special group of variables that may be analyzed within device. For instance, declaring covariates="Region" will allow analysis of regions within device. These variables should be categorical in nature.

descriptors : are additional variables that should be retained for the purpose of describing individual events in downstream analysis.

implant_days : contains the age in days of an implantable device at the time of the event.

exposure() to Standardize Exposure Data {#ex}

Exposure data is meant to support device-event data. As such, the general expectation is that variable values match between exposure and device-event data. For example, 10 exposures for ev3 Solitaire in France will be matched exactly to ev3 Solitaire events in France, and not to events for EV3 SOLITAIRE in FRANCE.

Basic Usage

ex <- exposure(sales, "sales_month", "device_name")
head(ex, 3)
knitr::kable(head(ex, 3))

Advanced Usage

ex <- exposure(
  sales,
  time="sales_month",
  device_hierarchy="device_name",
  match_levels="region",
  count="sales_volume")
head(ex, 3)
knitr::kable(head(ex, 3))

Required Arguments

Note: Although not required, count will commonly be used as well.

data_frame : is the input exposure data frame. All remaining arguments refer to variables within this data frame.

time : is the name of the variable containing the time of the exposure. The input format of the time will be flexibly converted into Date format. If exposure will be used, it is critical to have sufficient time granularity. For example, if analysis will be done monthly, exposure data must be no less granular than monthly. mds does not make assumptions about filling in holes in time!

device_hierarchy : contains all exposure device variables to match to your device-event data. As such, the values within these variables must match exactly to the values within each respective variable in the device-event data device_hierarchy parameter.

event_hierarchy : contains all exposure event variables to match to your device-event data. As such, the values within these variables must match exactly to the values within each respective variable in the device-event data event_hierarchy parameter. Exposures at an event level is not common.

Optional Arguments

count : is the most commonly specified optional parameter. It contains the number of exposures. If not specified, the number of rows will be used as a proxy for count.

key : is a unique identifier for each unique exposure in data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to be able to "look up" individual constituent exposure records.

match_levels : are variables aside from time, device, and event that specify an exposure. A common "match level" is country, if your exposure data is specific by country.

define_analyses() to Enumerate Analysis Combinations {#da}

After standardizing device-event data using deviceevent() and, optionally, exposure data using exposure(), the next step is to discover what types of analyses are possible. This is separated from actually doing the analysis (counting, calculations, statistics, etc.) because:

Basic Usage

da <- define_analyses(de, "device_name")

Note that define_analyses() returns a list of individual analyses. Each individual analysis contains a set of instructions. You can view an analysis by submitting da[[1]], da[[2]], etc., but a less cumbersome overview is possible using summary() and define_analyses_dataframe().

r options(warn=-1)

summary(da)

r options(warn=0)

head(define_analyses_dataframe(da), 3)
knitr::kable(head(define_analyses_dataframe(da), 3))

Advanced Usage

da <- define_analyses(
  de,
  device_level="device_name",
  exposure=ex,
  covariates="region")
summary(da)
head(define_analyses_dataframe(da), 3)
knitr::kable(head(define_analyses_dataframe(da), 3))

Required Arguments

deviceevents : is a standardized device-event data frame. (class() should contain "mds_de")

device_level : is the source device variable to analyze by. Access these variable names by submitting attributes(de)$device_hierarchy.

Optional Arguments

event_level : is the source event variable to analyze by. Access these variable names by submitting attributes(de)$event_hierarchy.

exposure : is a standardized exposure data frame. (class() should contain "mde_e")

date_level and date_level_n : are arguments used together to specify the time interval to analyze by. Default of "months" and 1 analyzes by month. Other examples include "months" and 12 for yearly, or "days" and 7 for weekly.

covariates : are variables to analyze within device. For example, c("region") analyzes by each level of region within device.

times_to_calc : specifies how many time periods (counting back in time) to analyze for. Time period is defined using date_level and date_level_n.

What About Analyses Across All Events, All Devices, etc?

It is always assumed that analyses at aggregated levels are desired. (such as analysis of all events for a given device, or analysis of all events across all devices)

Aggregated level analysis is easily recognized by the "All" and "Data" values in device_level, event_level, covariate, and covariate_level.

knitr::kable(define_analyses_dataframe(da)[c(11:12, 32), ])

How to Customize Analyses

There are several options:

time_series() to Generate Counts, Rates, and More {#ts}

Once an analysis has been defined using define_analyses(), the analyses instructions can be executed using time_series(), returning by defined time periods:

Basic Usage

ts <- time_series(da, de)

Note that time_series() returns, in a list, one time series data frame for every analysis. You can select a time series by submitting ts[[1]], ts[[2]], etc.

head(ts[[1]], 3)
knitr::kable(head(ts[[1]], 3))

Advanced Usage

ts <- time_series(
  da,
  deviceevents=de,
  exposure=ex)
head(ts[[1]], 3)
knitr::kable(head(ts[[1]], 3))

Required Arguments

analysis : is a single defined analysis (class() should contain "mds_da") or a list of defined analysis.

deviceevents : is a standardized device-event data frame (class() contains "mds_de"). It is typically the same data frame used to generate analysis, but can be another "mds_de" data frame, such as a cut of the data at a different time. Note if, say, an older dataset is being used, the analysis date ranges must correspond.

Optional Arguments

exposure : is a standardized exposure data frame (class() contains "mds_e"). It is typically the same data frame used to generate analysis. Like deviceevents, another data frame may be used, but the analysis instructions must correspond.

use_hierarchy : is a logical value for whether device and event hierarchies should be used for the calculation of disproportionality analysis (DPA) counts. Submit ?time_series.mds_da for more details.

How to Modify Counts & Exposures

It is not uncommon to adjust event and exposure counts, such as with applications of rolling or moving averages. These adjustments should be applied after generating time series data frames from time_series().

plot()ing a Time Series {#po}

Plotting an individual time series generated by time_series() is simple. Simply call plot() on the time series object:

plot(ts[[1]])

There are a few custom parameters, including:

mode : with common values of "nA" (representing the device-event of interest), "exposure", and "rate" (simply "nA"/"exposure"). Less common are "nB", "nC", and "nD" representing the cell counts of the disproportionality analysis (DPA) contingency table.

xlab, ylab, main : representing default plot() behavior. By default, axes and title labels are inferred directly from the time series.

All other parameters are from plot.default().



Try the mds package in your browser

Any scripts or data that you put into this service are public.

mds documentation built on July 1, 2020, 10:38 p.m.