pre_process_glm | R Documentation |
pre_process_glm
Function to remove known temporal effects from time series. It fits a generalized linear model (GLM) to the time series and returns the residuals.
pre_process_glm(x, ...)

## S4 method for signature 'syndromicD'
pre_process_glm(x, slot = "observed", syndromes = NULL,
  family = "poisson", formula = NULL, frequency = 365,
  print.model = TRUE, plot = TRUE)

## S4 method for signature 'syndromicW'
pre_process_glm(x, slot = "observed", syndromes = NULL,
  family = "poisson", formula = NULL, frequency = 52,
  print.model = TRUE, plot = TRUE)
x |
a syndromic ( |
... |
Additional arguments to the method. |
slot |
the slot in the |
syndromes |
an optional parameter, if not specified, all
columns in the slot observed (or baseline if that
was chosen in the previous parameter) of the |
family |
the GLM distribution family used; the default is "poisson". If "nbinom" is specified, the function glm.nb is used instead. |
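The choice between the two families can be explored outside the package. The following is a minimal sketch, not the package's internal code: it assumes simulated overdispersed counts and a hypothetical 5-day `dow` factor, and compares `glm()` with `family = poisson` against `glm.nb()` from the MASS package (a recommended package that is typically installed with R).

```r
library(MASS)  # provides glm.nb

set.seed(1)
n <- 200
day <- seq_len(n)
dow <- factor((day - 1) %% 5 + 1)  # hypothetical 5-day working week

# Simulated overdispersed counts (illustrative data only)
mu <- exp(1 + 0.2 * (as.integer(dow) == 1))
y <- rnbinom(n, size = 2, mu = mu)

fit_pois <- glm(y ~ dow, family = poisson)  # Poisson GLM
fit_nb <- glm.nb(y ~ dow)                   # negative binomial GLM

# Compare the two fits; a lower AIC suggests the better-fitting family
c(poisson = AIC(fit_pois), nbinom = AIC(fit_nb))
```

When the counts are clearly overdispersed, the negative binomial fit would usually be preferred; the comparison itself is a modelling judgment rather than something the function decides.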
formula |
the regression formula to be used, in the R formula format: y~x1+x2...
If a formula is provided when this function is called, that formula is used.
If none is provided, the function looks for formulas in the @formula slot of the syndromic object.
We recommend providing a formula in the call while testing various models; once a model is chosen,
we recommend saving that formula in the syndromic object using:
my.syndromic@formula <- list(formula1,formula2...), with one entry for each syndrome in the syndromic object (columns in observed).
NA can be provided when a syndrome is not to be associated with a particular formula.
Any variables (x1, x2...) must be given the same name they have in the
slot @dates. When providing a formula, two options are possible: providing a single formula to be applied to all
syndromes, or providing the same number of formulas (in a list) as the number of syndromes in the observed object,
even if not all of them will be used (see the examples).
The variables that are standard in that slot for DAILY data ( |
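The list-of-formulas convention described above can be illustrated in base R. This is a sketch only: the syndrome names are hypothetical, and the check with `inherits()` is one possible way to tell formulas from NA placeholders, not necessarily how the package does it.

```r
# Hypothetical object with three syndromes; names are illustrative only
syndrome_names <- c("A", "B", "C")

# One list entry per syndrome; NA marks a syndrome with no formula
formulas <- list(NA, y ~ dow + sin + cos + year, NA)
names(formulas) <- syndrome_names

# Identify which syndromes have a usable formula
has_formula <- vapply(formulas, inherits, logical(1), what = "formula")
has_formula
```

Keeping the list the same length as the number of syndromes means each formula stays aligned with its column in observed, even when only some syndromes are processed.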
frequency |
if pre-processing is applied using "glm" and the sin/cos functions are used, the length of the repeating cycle needs to be set. The default is one year (365 days or 52 weeks). |
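To see what the cycle length controls, the sketch below builds one sine and one cosine regressor with a period equal to `frequency`. The exact expression used inside the package is an assumption here; `time_index` is a hypothetical running index over the time series.

```r
frequency <- 365                      # cycle length in time steps (52 for weekly data)
time_index <- seq_len(2 * frequency)  # two full cycles of a hypothetical time index

# Assumed form of the seasonal regressors: one full oscillation per cycle
sin_term <- sin(2 * pi * time_index / frequency)
cos_term <- cos(2 * pi * time_index / frequency)

# The terms repeat after exactly one cycle
all.equal(sin_term[1], sin_term[1 + frequency])
```

With a mismatched cycle length (for example, 365 applied to weekly data), the seasonal terms would drift out of phase with the calendar year.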
print.model |
whether the result of model fitting should be printed on the console. This is recommended when the user is exploring which explanatory variables to keep or drop. |
plot |
whether plots comparing observed data and the result of the pre-processing should be displayed. |
This function is provided for users interested in capturing
(saving or plotting) the results of this pre-processing step.
However, in the context of syndromic surveillance through objects
of the class syndromic (syndromicD or syndromicW), pre-processing is
performed in conjunction with the application of control charts,
within the detection algorithms, which save their results into an
object of the class syndromic (D or W).
See ewma_synd(), shew_synd() and cusum_synd().
A matrix with all the pre-processed vectors.
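The shape of such a result can be sketched in base R. This example is an illustration under stated assumptions, not the package's implementation: the two syndromes are simulated, the weekly cycle of 52 is taken from the default above, and response residuals (observed minus fitted) are assumed because the documentation does not say which residual type is returned.

```r
set.seed(42)
n <- 260
week <- seq_len(n)
sin_term <- sin(2 * pi * week / 52)
cos_term <- cos(2 * pi * week / 52)

# Two simulated syndromes (illustrative data only)
observed <- cbind(
  A = rpois(n, exp(1.5 + 0.4 * sin_term)),
  B = rpois(n, exp(1.0 + 0.3 * cos_term))
)

# Fit one Poisson GLM per column and collect the residuals column-wise
residual_matrix <- sapply(colnames(observed), function(s) {
  fit <- glm(observed[, s] ~ sin_term + cos_term, family = poisson)
  residuals(fit, type = "response")  # assumed residual type
})

dim(residual_matrix)
```

Each column holds the pre-processed series for one syndrome, with one row per time point.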
Fernanda C. Dorea, Crawford W. Revie, Beverly J. McEwen, W. Bruce McNab, David Kelton, Javier Sanchez (2012). Retrospective time series analysis of veterinary laboratory data: Preparing a historical baseline for cluster detection in syndromic surveillance. Preventive Veterinary Medicine. DOI: 10.1016/j.prevetmed.2012.10.010.
## DAILY
data(lab.daily)
my.syndromicD <- raw_to_syndromicD(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   remove.dow=c(6,0),
                                   add.to=c(2,1),
                                   data=lab.daily)

pre_processed_data <- pre_process_glm(my.syndromicD,
    syndromes="Musculoskeletal",
    formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

pre_processed_data <- pre_process_glm(my.syndromicD,
    syndromes=c("GIT","Musculoskeletal"),
    formula=list(NA,
                 y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                 days~dow+month,
                 NA, NA))

pre_processed_data <- pre_process_glm(my.syndromicD,
    syndromes=c("GIT","Musculoskeletal"),
    formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

pre_processed_data <- pre_process_glm(my.syndromicD,
    syndromes=3,
    formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7))

pre_processed_data <- pre_process_glm(my.syndromicD,
    syndromes=c(2,3),
    formula=list(NA,
                 y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
                 days~dow+month,
                 NA, NA))

## WEEKLY
data(lab.daily)
my.syndromicW <- raw_to_syndromicW(id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="%d/%m/%Y",
                                   data=lab.daily)

pre_processed_data <- pre_process_glm(my.syndromicW,
    syndromes="Musculoskeletal",
    formula=list(y~year))

pre_processed_data <- pre_process_glm(my.syndromicW,
    syndromes=c("GIT","Musculoskeletal"),
    formula=list(NA, y~year, weeks~trend+sin+cos, NA, NA))

pre_processed_data <- pre_process_glm(my.syndromicW,
    syndromes=3,
    formula=list(y~year))

pre_processed_data <- pre_process_glm(my.syndromicW,
    syndromes=c(1,3),
    formula=list(y~year, NA, weeks~trend+sin+cos, NA, NA))