pre_process_glm-methods: 'pre_process_glm'

pre_process_glm R Documentation

pre_process_glm

Description

Removes known temporal effects from time series. The function fits a generalized linear model (GLM) to the time series and returns the residuals.
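As a rough illustration of what "fit a GLM and keep the residuals" means, the sketch below uses only base R on simulated counts. It is not the package's internal code: the simulated data, the variable names, and the choice of response-scale residuals (observed minus fitted) are all assumptions made for the example.

```r
# Illustrative sketch in base R; not the internal code of pre_process_glm.
set.seed(1)
n   <- 365
day <- seq_len(n)
dow <- factor((day - 1) %% 7)                  # hypothetical day-of-week factor
mu  <- exp(1 + 0.3 * (dow %in% c("0", "1")))   # two weekdays with higher counts
y   <- rpois(n, mu)                            # simulated daily counts

# Model the known temporal effect (here, day of week) with a Poisson GLM.
fit <- glm(y ~ dow, family = poisson)

# Assumption: residuals taken on the response scale (observed - fitted).
res <- y - fitted(fit)

# The residual series has the day-of-week pattern removed and can be
# inspected or plotted against the original counts.
summary(res)
```

The residual series is what a detection algorithm would then monitor, since the explainable weekly pattern is no longer present in it.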

Usage

pre_process_glm(x, ...)

## S4 method for signature 'syndromicD'
pre_process_glm(x, slot = "observed",
  syndromes = NULL, family = "poisson", formula = NULL, frequency = 365,
  print.model = TRUE, plot = TRUE)

## S4 method for signature 'syndromicW'
pre_process_glm(x, slot = "observed",
  syndromes = NULL, family = "poisson", formula = NULL, frequency = 52,
  print.model = TRUE, plot = TRUE)

Arguments

x

a syndromic object (syndromicD or syndromicW), which must contain at least the slot of observed data and a data frame in the slot dates.

...

Additional arguments to the method.

slot

the slot in the syndromic object to be processed. The default is "observed"; this argument can be used to change it to "baseline".

syndromes

an optional parameter. If not specified, all columns in the slot observed (or baseline, if that was chosen in the previous parameter) of the syndromic object will be used. The user can restrict the analyses to a few syndromic groups by listing their names or column positions in the observed matrix. See examples.

family

the GLM distribution family used, by default "poisson". If "nbinom" is used, the function glm.nb (from the MASS package) is used instead.

formula

the regression formula to be used, in the R formula format: y~x1+x2... If a formula is provided when this function is called, that formula is used. If none is provided, the function looks for formulas in the @formula slot of the syndromic object. We recommend providing a formula in the call while testing various models and, once a model is chosen, saving that formula in the syndromic object using: my.syndromic@formula <- list(formula1,formula2...), with one entry for each syndrome in the syndromic object (each column in observed). NA can be provided when a syndrome is not to be associated with a particular formula. Any variables (x1, x2...) must be given the same name they have in the slot @dates. When providing a formula, two options are possible: providing a single formula to be applied to all syndromes, or providing the same number of formulas (in a list) as the number of syndromes in the observed object, even if not all of them will be used (see examples). The variables that are standard in that slot for DAILY data (syndromicD) are: trend (for a monotonic trend), year, month, dow (day of week), sin, cos, and AR1 (autoregressive for 1 day) to AR7. For WEEKLY data (syndromicW): trend, sin, cos, year, and 1 to 4 autoregressive variables. These elements can be combined into any formula. Since the @dates slot can be customized by the user, any variable in the dates data.frame can be used in the formula.

frequency

when pre-processing is applied using "glm" AND the sin/cos variables are used, the length of the repeating cycle needs to be set. The default is one year (365 days or 52 weeks).

print.model

whether the result of model fitting should be printed on the console. This is recommended when the user is exploring which explanatory variables to keep or drop.

plot

whether plots comparing observed data and the result of the pre-processing should be displayed.
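The formula and frequency arguments described above can be illustrated with a small base R sketch. Everything in it is hypothetical: dates_df stands in for the data frame in the @dates slot, the syndrome names are invented, and the way the sin and cos columns are built from the cycle length is an assumption about a typical harmonic construction, not a statement of how the package computes them.

```r
# Hypothetical stand-in for the data frame in the @dates slot (daily data).
period   <- 365                      # plays the role of the `frequency` argument
n        <- 2 * period
dates_df <- data.frame(trend = seq_len(n))

# Assumed harmonic terms: one full cycle every `period` rows.
dates_df$sin <- sin(2 * pi * dates_df$trend / period)
dates_df$cos <- cos(2 * pi * dates_df$trend / period)

# One list entry per syndrome (column of observed); NA means "no formula".
syndrome_names  <- c("A", "B", "C")  # hypothetical syndrome names
formulas        <- list(NA, y ~ trend, y ~ trend + sin + cos)
names(formulas) <- syndrome_names

# Which syndromes actually carry a formula?
has_formula <- vapply(formulas, inherits, logical(1), what = "formula")

# Right-hand-side variables that are NOT columns of the dates data frame.
# An empty result means every variable name matches a column.
missing_vars <- lapply(formulas[has_formula], function(f) {
  setdiff(all.vars(f[[3]]), names(dates_df))
})
```

A check like missing_vars can catch a misspelled variable name before a formula is saved in the syndromic object.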

Details

This function is provided for users interested in capturing (saving or plotting) the results of this pre-processing step. In the context of syndromic surveillance through objects of the class syndromic (syndromicD or syndromicW), however, pre-processing is performed within the detection algorithms, in conjunction with the application of control charts, and the results are saved into an object of the class syndromic (D or W). See ewma_synd(), shew_synd() and cusum_synd().
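To make the link between pre-processing and control charts concrete, the sketch below applies a generic textbook EWMA chart to a simulated residual series in base R. It does not reproduce ewma_synd(): the smoothing weight lambda, the 3-sigma limit, the baseline window and the simulated shift are all assumptions chosen for illustration.

```r
# Generic EWMA chart on a residual series; not the ewma_synd() implementation.
set.seed(3)
res <- rnorm(100)                     # stand-in for pre-processed residuals
res[81:100] <- res[81:100] + 2        # hypothetical sustained increase

lambda <- 0.2                         # assumed smoothing weight
ewma   <- numeric(length(res))
z      <- 0                           # chart starts at the expected residual mean
for (i in seq_along(res)) {
  z       <- lambda * res[i] + (1 - lambda) * z
  ewma[i] <- z
}

# Asymptotic upper control limit, using the first 80 points as baseline.
sigma  <- sd(res[1:80])
limit  <- 3 * sigma * sqrt(lambda / (2 - lambda))
alarms <- which(ewma > limit)         # time points flagged by the chart
```

Working on residuals rather than raw counts is what lets a fixed control limit be used, because the predictable temporal effects have already been removed.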

Value

A matrix with all the pre-processed vectors.

References

Fernanda C. Dorea, Crawford W. Revie, Beverly J. McEwen, W. Bruce McNab, David Kelton, Javier Sanchez (2012). Retrospective time series analysis of veterinary laboratory data: Preparing a historical baseline for cluster detection in syndromic surveillance. Preventive Veterinary Medicine. DOI: 10.1016/j.prevetmed.2012.10.010.

Examples

## DAILY
data(lab.daily)
my.syndromicD <- raw_to_syndromicD(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   remove.dow=c(6,0),
                                   add.to=c(2,1),
                                   data=lab.daily)

## one syndrome selected by name, single formula
pre_processed_data <- pre_process_glm(my.syndromicD,
                              syndromes="Musculoskeletal",
                              formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))
## two syndromes selected by name, one list entry per column in observed
## (NA where no formula applies)
pre_processed_data <- pre_process_glm(my.syndromicD,
                              syndromes=c("GIT","Musculoskeletal"),
                              formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                              days~dow+month,NA,NA))
## two syndromes selected by name, single formula applied to both
pre_processed_data <- pre_process_glm(my.syndromicD,
                              syndromes=c("GIT","Musculoskeletal"),
                              formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

## syndromes selected by column position
pre_processed_data <- pre_process_glm(my.syndromicD,
                              syndromes=3,
                              formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))
pre_processed_data <- pre_process_glm(my.syndromicD,
                              syndromes=c(2,3),
                              formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                              days~dow+month,NA,NA))
## WEEKLY
data(lab.daily)
my.syndromicW <- raw_to_syndromicW(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   data=lab.daily)

## one syndrome selected by name, single formula
pre_processed_data <- pre_process_glm(my.syndromicW,
                              syndromes="Musculoskeletal",
                              formula=list(y~year))
## two syndromes selected by name, one list entry per column in observed
pre_processed_data <- pre_process_glm(my.syndromicW,
                              syndromes=c("GIT","Musculoskeletal"),
                              formula=list(NA,y~year,weeks~trend+sin+cos,NA,NA))
## syndromes selected by column position
pre_processed_data <- pre_process_glm(my.syndromicW,
                              syndromes=3,
                              formula=list(y~year))
pre_processed_data <- pre_process_glm(my.syndromicW,
                              syndromes=c(1,3),
                              formula=list(y~year,NA,weeks~trend+sin+cos,NA,NA))


nandadorea/vetsyn documentation built on April 30, 2022, 1:15 a.m.