pre_process_glm-methods: 'pre_process_glm'
In nandadorea/vetsyn: Tools for Syndromic Surveillance Implementation

`pre_process_glm`

Description

Function to remove known temporal effects from a time series. It fits a generalized linear model (GLM) to the time series and returns the residuals.
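
The idea can be illustrated outside the package with base R alone. The following is a minimal sketch on simulated counts, not the code used by pre_process_glm: the data, the variable names and the definition of the residuals (observed minus fitted counts) are all assumptions made for illustration.

```r
# Simulated daily counts with a day-of-week effect (illustrative data only).
set.seed(1)
n <- 140
dow <- factor(rep(1:7, length.out = n))
mu <- exp(1 + 0.5 * (as.integer(dow) <= 5))  # higher expected counts on days 1 to 5
y <- rpois(n, mu)

# Fit a Poisson GLM with the known temporal effect as explanatory variable.
fit <- glm(y ~ dow, family = poisson)

# Residuals taken here as observed minus fitted counts (an assumption;
# pre_process_glm may define them differently).
res <- y - fitted(fit)
```

After this step, res no longer carries the day-of-week pattern present in y, which is what makes such a series easier to monitor.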

Usage

pre_process_glm(x, ...)

## S4 method for signature 'syndromicD'
pre_process_glm(x, slot = "observed",
  syndromes = NULL, family = "poisson", formula = NULL, frequency = 365,
  print.model = TRUE, plot = TRUE)

## S4 method for signature 'syndromicW'
pre_process_glm(x, slot = "observed",
  syndromes = NULL, family = "poisson", formula = NULL, frequency = 52,
  print.model = TRUE, plot = TRUE)

Arguments

`x`	a syndromic (`syndromicD` or `syndromicW`) object, which must have at least the slot of observed data and a data frame in the slot dates.
`...`	Additional arguments to the method.
`slot`	the slot in the `syndromic` object to be processed. The default is "observed", but this argument can be used to change it to "baseline".
`syndromes`	an optional parameter. If not specified, all columns in the slot observed (or baseline, if that was chosen in the previous parameter) of the `syndromic` object will be used. The user can restrict the analyses to a few syndromic groups by listing their names or column positions in the observed matrix. See examples.
`family`	the GLM distribution family used, by default "poisson". If "nbinom" is used, the function glm.nb is used instead of glm.
`formula`	the regression formula(s) to be used, in the R formula format: y~x1+x2... If a formula is provided when this function is called, that formula is used; if none is provided, the function looks for formulas in the @formula slot of the syndromic object. We recommend providing a formula in the call while testing various models and, once a model is chosen, saving it in the syndromic object using my.syndromic@formula <- list(formula1,formula2...), with one entry per syndrome in the syndromic object (columns in observed). NA can be provided for a syndrome that is not to be associated with any formula. Any variables (x1, x2...) must have the same names they have in the slot @dates. When providing a formula in the call, two options are possible: a single formula to be applied to all syndromes, or a list with as many formulas as there are syndromes in the observed slot, even if not all of them will be used (see examples). The variables that are standard in that slot for DAILY data (`syndromicD`) are: trend (for a monotonic trend), year, month, dow (day of week), sin, cos, and AR1 (autoregressive term at a lag of 1 day) to AR7. For WEEKLY data (`syndromicW`): trend, sin, cos, year, and 1 to 4 autoregressive variables. These elements can be combined into any formula. Since the @dates slot can be customized by the user, any variable in the dates data.frame can be used in the formula.
`frequency`	when the sin/cos variables are used in the formula, the length of the cycle they represent needs to be set. The default is a year (365 days or 52 weeks).
`print.model`	whether the result of model fitting should be printed on the console. This is recommended when the user is exploring which explanatory variables to keep or drop.
`plot`	whether plots comparing observed data and the result of the pre-processing should be displayed.
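
To make the role of frequency and of the trend, sin, cos and AR variables more concrete, the sketch below builds such columns by hand for a short series. The helper make_terms is hypothetical and not part of vetsyn; whether the package computes the columns of the @dates slot in exactly this way (one full cycle every frequency time points, lagged copies of the series for the AR terms) is an assumption.

```r
# Hypothetical helper (not part of vetsyn): build trend, sin, cos and lagged
# columns for a count series y, using the variable names listed above.
make_terms <- function(y, frequency = 365, lags = 1:2) {
  time_index <- seq_along(y)
  terms <- data.frame(
    trend = time_index,
    sin = sin(2 * pi * time_index / frequency),  # one full cycle per `frequency` points
    cos = cos(2 * pi * time_index / frequency)
  )
  for (k in lags) {
    # AR<k>: the series shifted by k time points, padded with NA at the start
    terms[[paste0("AR", k)]] <- c(rep(NA, k), head(y, -k))
  }
  terms
}

# Weekly example: a cycle of 52 time points.
terms <- make_terms(c(3, 5, 2, 4, 6, 1), frequency = 52)
```

With frequency = 52 the sin and cos columns complete one cycle per year of weekly data, while frequency = 365 does the same for daily data.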

Details

This function is provided for users interested in capturing (saving or plotting) the results of this pre-processing step. However, in the context of syndromic surveillance through objects of the class syndromic (syndromicD or syndromicW), pre-processing is performed within the detection algorithms, in conjunction with the application of control charts, and the results are saved into an object of the class syndromic (D or W). See ewma_synd(), shew_synd() and cusum_synd().
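
As a rough illustration of why residuals are passed on to control charts, the sketch below applies a simple Shewhart-style rule to a vector of simulated residuals. It does not reproduce ewma_synd(), shew_synd() or cusum_synd(); the baseline window, the limit of 3 standard deviations and the data are arbitrary assumptions.

```r
# Illustrative residuals with one injected aberration (not real data).
set.seed(42)
res <- rnorm(100)
res[80] <- 8

# Shewhart-style rule: standardize against a baseline window and flag
# values more than 3 standard deviations above its mean. The window
# (first 50 points) and the limit (3) are arbitrary choices for this sketch.
baseline <- res[1:50]
z <- (res - mean(baseline)) / sd(baseline)
alarms <- which(z > 3)
```

Here alarms holds the time points at which the rule is exceeded, including the injected aberration at position 80.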

Value

A matrix with all the pre-processed vectors.

References

Fernanda C. Dorea, Crawford W. Revie, Beverly J. McEwen, W. Bruce McNab, David Kelton, Javier Sanchez (2012). Retrospective time series analysis of veterinary laboratory data: Preparing a historical baseline for cluster detection in syndromic surveillance. Preventive Veterinary Medicine. DOI: 10.1016/j.prevetmed.2012.10.010.

Examples

## DAILY
data(lab.daily)
my.syndromicD <- raw_to_syndromicD(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   remove.dow=c(6,0),
                                   add.to=c(2,1),
                                   data=lab.daily)

## one syndrome selected by name, a single formula
pre_processed_data <- pre_process_glm(my.syndromicD,
                                      syndromes="Musculoskeletal",
                                      formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

## two syndromes selected by name, one list entry per column in observed
## (NA where no formula is given)
pre_processed_data <- pre_process_glm(my.syndromicD,
                                      syndromes=c("GIT","Musculoskeletal"),
                                      formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                                                   days~dow+month,NA,NA))

## two syndromes selected by name, a single formula applied to both
pre_processed_data <- pre_process_glm(my.syndromicD,
                                      syndromes=c("GIT","Musculoskeletal"),
                                      formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

## syndromes selected by column position instead of name
pre_processed_data <- pre_process_glm(my.syndromicD,
                                      syndromes=3,
                                      formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))
pre_processed_data <- pre_process_glm(my.syndromicD,
                                      syndromes=c(2,3),
                                      formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                                                   days~dow+month,NA,NA))
## WEEKLY
data(lab.daily)
my.syndromicW <- raw_to_syndromicW(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   data=lab.daily)

## one syndrome selected by name, a single formula
pre_processed_data <- pre_process_glm(my.syndromicW,
                                      syndromes="Musculoskeletal",
                                      formula=list(y~year))

## two syndromes selected by name, one list entry per column in observed
## (NA where no formula is given)
pre_processed_data <- pre_process_glm(my.syndromicW,
                                      syndromes=c("GIT","Musculoskeletal"),
                                      formula=list(NA,y~year,weeks~trend+sin+cos,NA,NA))

## syndromes selected by column position instead of name
pre_processed_data <- pre_process_glm(my.syndromicW,
                                      syndromes=3,
                                      formula=list(y~year))
pre_processed_data <- pre_process_glm(my.syndromicW,
                                      syndromes=c(1,3),
                                      formula=list(y~year,NA,weeks~trend+sin+cos,NA,NA))

nandadorea/vetsyn documentation built on April 30, 2022, 1:15 a.m.