clean_baseline | R Documentation |
The cleaning is based on fitting the complete time series using regression methods (by default Poisson regression; any other glm family is accepted, and negative binomial is supported through the package fitdistrplus), and then removing any observations that fall outside a given confidence interval (set by the user). These observations are replaced by the model prediction for that time point.
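The cleaning idea described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the simulated data, the variable names (y, dow, upper, baseline) and the choice of a one-sided upper bound taken from the Poisson quantile at the fitted mean are all assumptions made for the example.

```r
# Illustrative sketch only; NOT the internal code of clean_baseline().
set.seed(42)

# Hypothetical daily counts with a day-of-week effect
n   <- 365
dow <- factor((seq_len(n) - 1) %% 7)
mu  <- exp(1 + 0.3 * (as.integer(dow) %in% c(2, 3)))
y   <- rpois(n, mu)

# Inject an artificial outbreak so there is something to clean
outbreak_days <- 200:204
y[outbreak_days] <- y[outbreak_days] + 25

# Fit a Poisson regression to the complete series
fit  <- glm(y ~ dow, family = poisson)
pred <- predict(fit, type = "response")

# Assumption: the upper bound is the Poisson quantile at the fitted mean
limit <- 0.95
upper <- qpois(limit, lambda = pred)

# Replace observations above the bound with the (rounded) model prediction
outlier  <- y > upper
baseline <- y
baseline[outlier] <- round(pred[outlier])
```

Comparing y and baseline (for example with plot(y, type = "l"); lines(baseline, col = "red")) shows the injected outbreak being flattened while the remaining observations are left unchanged.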
clean_baseline(x, ...)

## S4 method for signature 'syndromicD'
clean_baseline(x, syndromes = NULL, family = "poisson",
  limit = 0.95, formula = NULL, frequency = 365, plot = TRUE,
  print.model = TRUE)

## S4 method for signature 'syndromicW'
clean_baseline(x, syndromes = NULL, family = "poisson",
  limit = 0.95, formula = "year+sin+cos", plot = TRUE,
  print.model = TRUE, frequency = 52)
x |
a syndromic (syndromicD or syndromicW) object. |
... |
Additional arguments to the method. |
syndromes |
an optional parameter; if not specified, all columns in the slot observed of the syndromic object will be used. The user can restrict the analyses to a few syndromic groups by listing their names or column positions in the observed matrix. See examples. |
family |
the GLM distribution family used, by default "poisson". If "nbinom" is used, the function glm.nb is used instead. |
limit |
the confidence interval to be used in identifying outliers. |
formula |
the regression formula to be used, in the R formula format: y~x1+x2...
If none is provided, the function looks for formulas in the @formula slot of the syndromic object.
If a formula is provided when this function is called, that formula is used. We recommend providing a formula in the call while testing various models; once a model is chosen, we recommend saving it in the syndromic object using:
my.syndromic@formula <- list(formula1,formula2...), with one entry for each syndrome in the syndromic object (columns in observed).
NA can be provided when a syndrome is not to be associated with a particular formula.
Any variables (x1, x2...) must be given the same names they have in the slot @dates. When providing a formula, two options are possible: providing a single formula to be applied to all syndromes, or providing a list with as many formulas as there are syndromes in the observed matrix, even if not all of them will be used (see examples).
The variables that are standard in that slot for DAILY data ( |
plot |
whether plots comparing observed data and the result of the cleaning process should be displayed. |
print.model |
whether the result of model fitting should be printed on the console. This is recommended when the user is exploring which explanatory variables to keep or drop. |
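For the family argument, the description states that glm.nb is used when "nbinom" is chosen. The following is a minimal sketch of what a negative binomial fit of that kind looks like, using glm.nb from the MASS package on simulated data. The data, the variable names and the use of qnbinom for the upper bound are assumptions for illustration and do not reproduce the package's internal code.

```r
# Illustrative sketch only; NOT the internal code of clean_baseline().
library(MASS)  # provides glm.nb()

set.seed(1)

# Hypothetical overdispersed daily counts with a day-of-week effect
n   <- 520
dow <- factor((seq_len(n) - 1) %% 5)
mu  <- exp(1.5 + 0.4 * (as.integer(dow) == 1))
y   <- rnbinom(n, size = 2, mu = mu)

# Negative binomial regression; theta is estimated from the data
fit  <- glm.nb(y ~ dow)
pred <- predict(fit, type = "response")

# Assumption: upper bound taken from the negative binomial quantile
limit <- 0.95
upper <- qnbinom(limit, size = fit$theta, mu = pred)

# Replace observations above the bound with the (rounded) model prediction
outlier  <- y > upper
baseline <- y
baseline[outlier] <- round(pred[outlier])
```

With overdispersed counts the negative binomial bound is wider than the Poisson bound at the same mean, so fewer observations are flagged as outliers.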
An object of the class syndromic (syndromicD or syndromicW) which contains all elements from the object provided in x, but in which the slot baseline has been filled with an outbreak-free baseline for each syndromic group. When the user chooses to restrict analyses to some syndromes, the remaining columns are kept as is (if the slot was not empty) or filled with NAs when previously empty.
Fernanda C. Dorea, Crawford W. Revie, Beverly J. McEwen, W. Bruce McNab, David Kelton, Javier Sanchez (2012). Retrospective time series analysis of veterinary laboratory data: Preparing a historical baseline for cluster detection in syndromic surveillance. Preventive Veterinary Medicine. DOI: 10.1016/j.prevetmed.2012.10.010.
## Examples for 'syndromicD'
data(lab.daily)
my.syndromicD <- raw_to_syndromicD(id=SubmissionID,
  syndromes.var=Syndrome,
  dates.var=DateofSubmission,
  date.format="%d/%m/%Y",
  remove.dow=c(6,0),
  add.to=c(2,1),
  data=lab.daily)

## a single formula applied to all syndromes
my.syndromicD <- clean_baseline(my.syndromicD,
  formula=list(days~dow+month+year),
  frequency=260)

## restricting the analysis to selected syndromes (by name or position)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes="Musculoskeletal",
  formula=list(days~dow+month+year),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=c("GIT","Musculoskeletal"),
  formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
    days~dow+month,NA,NA),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=3,
  formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
    days~dow+month,NA,NA),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=c(2,3),
  formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
    days~dow+month,NA,NA),
  frequency=260)

## negative binomial family
my.syndromicD <- clean_baseline(my.syndromicD,
  family="nbinom",
  formula=list(days~dow+month),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes="Musculoskeletal",
  family="nbinom",
  formula=list(y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=c("GIT","Musculoskeletal"),
  family="nbinom",
  formula=list(NA,y~dow+sin+cos+year+AR1+AR2+AR3+AR4+AR5+AR6+AR7,
    days~dow+month,NA,NA),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=3,
  family="nbinom",
  formula=list(days~dow+month),
  frequency=260)
my.syndromicD <- clean_baseline(my.syndromicD,
  syndromes=c(2,3),
  family="nbinom",
  formula=list(days~dow+month),
  frequency=260)

## Examples for 'syndromicW'
data(lab.daily)
my.syndromicW <- raw_to_syndromicW(id=SubmissionID,
  syndromes.var=Syndrome,
  dates.var=DateofSubmission,
  date.format="%d/%m/%Y",
  formula=list(NA,y~year,weeks~trend+sin+cos,NA,NA),
  data=lab.daily)

## one formula per syndrome, or a single formula for all syndromes
my.syndromicW <- clean_baseline(my.syndromicW,
  formula=list(NA,y~year,weeks~trend+sin+cos,NA,NA))
my.syndromicW <- clean_baseline(my.syndromicW,
  formula=list(week~sin+cos))

## restricting the analysis to selected syndromes (by name or position)
my.syndromicW <- clean_baseline(my.syndromicW,
  syndromes="Musculoskeletal",
  formula=list(week~sin+cos))
my.syndromicW <- clean_baseline(my.syndromicW,
  syndromes=c("GIT","Musculoskeletal"),
  formula=list(week~sin+cos))
my.syndromicW <- clean_baseline(my.syndromicW,
  syndromes=3,
  formula=list(week~sin+cos))
my.syndromicW <- clean_baseline(my.syndromicW,
  syndromes=c(1,3),
  formula=list(NA,y~year,weeks~trend+sin+cos,NA,NA))

## negative binomial family
my.syndromicW <- clean_baseline(my.syndromicW,
  family="nbinom",
  formula=list(week~sin+cos))
my.syndromicW <- clean_baseline(my.syndromicW,
  syndromes="Musculoskeletal",
  family="nbinom",
  formula=list(week~sin+cos))