raw_to_syndromicW-methods: raw_to_syndromicD
In nandadorea/vetsyn: Tools for Syndromic Surveillance Implementation

raw_to_syndromicD

Description

Creates an object of the class syndromicD from raw, observed data, for daily monitoring.

Usage

raw_to_syndromicD(id, syndromes.var, syndromes.name, dates.var,
  date.format = "%d/%m/%Y", min.date, max.date, remove.dow = FALSE,
  add.to = 0, sort = TRUE, data = NULL, formula = list())

Arguments

`id`	Indicates a variable (or multiple variables) used to identify unique events in the data. It can be provided as an R vector (e.g. mydata$myid), as the name of a data frame column (e.g. id=myid, data=my.data), or as multiple column names (e.g. id=list(id1,id2,id3), data=my.data).
`syndromes.var`	the variable that identifies group membership (in general the syndromic grouping). Can be `numeric`, `character` or `factor`.
`syndromes.name`	An optional argument providing the syndromic groups to be monitored. If not given, it is taken from the values found in `syndromes.var`. When syndromes.name IS provided, it should be a character value or vector (e.g. "Mastitis" or c("Mastitis","GIT")).
`dates.var`	The vector (dates.var=mydata$mydates) or column name (dates.var=mydates, data=mydata) where the dates of the events are to be found.
`date.format`	The format of the dates given in dates.var. Default is "%d/%m/%Y". See strptime() for format specifications.
`min.date`	An optional argument. If not provided, the minimum date found in the dataset is used.
`max.date`	An optional argument. If not provided, the maximum date found in the dataset is used.
`remove.dow`	An optional argument, by default set to FALSE. This allows the user to specify weekdays that must be removed from the dataset, for instance when weekends are not relevant. This must be set to integers between 0 and 6 specifying the days of the week to be removed. To remove Saturdays and Sundays, for instance, set remove.dow=c(6,0). (Note that in R days of the week are counted from 0 = Sunday to 6 = Saturday.)
`add.to`	When remove.dow is used, the user has the option to completely remove any counts assigned to the days of the week being removed (set add.to=0) or to add them to a following or preceding day. For instance, when removing a single day of the week, its counts can be assigned to the following day or to the preceding day using add.to=1 or add.to=-1 respectively. Please note that the vector add.to must have exactly the same length as remove.dow. To remove weekends, adding any observed counts to the following Monday, the user would need to set remove.dow=c(6,0) and add.to=c(2,1) (Saturdays added 2 days ahead, and Sundays 1 day ahead).
`sort`	Default is TRUE, which orders the groups found in syndromes.name alphabetically. If set to FALSE, groups are listed in the order they are found in the dataset or provided in syndromes.name.
`data`	Optional argument. If used, the other arguments can be specified as column names within the dataset provided through this argument.
`formula`	A formula, or list of formulas, specifying the regression formula to be used when removing temporal patterns from each of the syndromes in @observed. For instance formula=list(y~dow+mon) for a single syndrome, where the regression must take into account the variables dow (day of week) and mon (month); or formula=c(y~dow, y~dow+mon) specifying two different formulas for two syndromes. The variable names given should exist as columns of the slot @dates. Make sure the order of the formulas matches the columns in observed (for instance, the second formula should correspond to the second syndrome, or second column in the observed matrix). You can provide NA for syndromes which should not be associated with any formula. This parameter is often only filled in after some analysis of the data, not at the time of object creation.
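
The interaction between remove.dow and add.to can be illustrated with a minimal base R sketch. This is not the package's implementation; the dates, counts and helper variables (idx, shift, target) are hypothetical, and the sketch only covers the case where removed days are shifted by a non-zero add.to.

```r
# Hypothetical counts for Friday 7 to Monday 10 January 2011
dates  <- as.Date(c("07/01/2011", "08/01/2011", "09/01/2011", "10/01/2011"),
                  format = "%d/%m/%Y")
counts <- c(3, 2, 1, 4)

# Day of week as an integer: 0 = Sunday ... 6 = Saturday
dow <- as.POSIXlt(dates)$wday

remove.dow <- c(6, 0)  # remove Saturdays and Sundays
add.to     <- c(2, 1)  # Saturday moves 2 days ahead, Sunday 1 day ahead

# For each date, look up how many days its counts should be shifted
idx    <- match(dow, remove.dow)
shift  <- ifelse(is.na(idx), 0, add.to[idx])
target <- dates + shift

# Sum the counts on the day they end up on
moved <- tapply(counts, format(target), sum)
moved
```

In this sketch the weekend counts are added to the Monday, so only the Friday and the Monday remain in the result.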

Details

Create an object of the class syndromicD from raw, observed data. This assumes the data will be monitored DAILY. For weekly monitoring please see rawD_to_syndromicW and rawW_to_syndromicW.

This function counts the number of cases for one or more defined groups, daily. Days without counts are assigned a count of zero, generating a complete sequence of dates. By default, the complete sequence starts at the minimum date found in the dataset and ends at the maximum date. However, it is also possible to provide a minimum date EARLIER than the minimum in the dataset or a maximum date LATER than the latest recorded. The extra days are assigned counts of zero (minimum or maximum dates already within the range of the dataset are ignored).
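
The zero-filling described above can be sketched in base R as follows. This is an illustration under assumed data, not the package's own code; the data frame raw and the variables event.dates, all.days and counts are hypothetical.

```r
# Hypothetical raw data: one row per event, dates stored as text
raw <- data.frame(
  id   = c("a", "b", "c", "d"),
  date = c("03/01/2011", "03/01/2011", "05/01/2011", "06/01/2011"),
  stringsAsFactors = FALSE
)

# Parse the dates using the same format string as the date.format default
event.dates <- as.Date(raw$date, format = "%d/%m/%Y")

# A minimum date EARLIER than the first event extends the sequence
min.date <- as.Date("01/01/2011", format = "%d/%m/%Y")
max.date <- max(event.dates)

# Complete daily sequence; days without events get a count of zero
all.days <- seq(min.date, max.date, by = "day")
daily    <- table(factor(format(event.dates), levels = format(all.days)))
counts   <- as.vector(daily)

data.frame(date = all.days, count = counts)
```

Using a factor whose levels are the full date sequence is what forces the empty days to appear with a zero count.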

The raw, observed data are assumed to be stored in a data.frame in which each observed event (for instance a laboratory submission) is recorded in one or multiple rows. Unique events can be identified by one unique ID. It is possible, however, to take into consideration a hierarchical organization of the data, in which a unique ID can only be established by taking into account multiple columns (e.g. animal ID is unique within farm, but not between farms, therefore the IDs are unique combinations of the variables "farm" and "animal").

Multiple events with the same unique ID are acceptable, but counted only once per time unit (e.g. day). Besides removing duplicated events, the function also completes missing days, assigning them a count of zero.
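
As a rough illustration of counting each unique ID only once per day when the ID is a combination of columns, consider the base R sketch below. It is not taken from the package source; the data frame raw and its columns farm, animal and date are hypothetical.

```r
# Hypothetical raw data: animal IDs are unique within a farm only
raw <- data.frame(
  farm   = c("F1", "F1", "F2", "F1"),
  animal = c(1, 1, 1, 1),
  date   = as.Date(c("2011-01-03", "2011-01-03", "2011-01-03", "2011-01-04")),
  stringsAsFactors = FALSE
)

# An event is identified by farm + animal; keep it once per day
dedup <- raw[!duplicated(raw[, c("farm", "animal", "date")]), ]

# Number of unique events per day
events.per.day <- table(format(dedup$date))
events.per.day
```

Here animal 1 of farm F1 appears twice on 3 January but contributes a single event for that day, while animal 1 of farm F2 is counted separately.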

The function counts the number of events, per day, for each of the groups found in the variable syndromes.var. However, the argument syndromes.name can be used to RESTRICT the groups counted (if not all values appearing in the data are to be subjected to monitoring, e.g. when "nonspecific" or "non-classified" values exist); or to EXTEND the list to include values which did not appear in the dataset. The latter is the recommended use of this function for regular monitoring, in order to ensure that groups with zero events in the specific data batch being analyzed are still represented in the output of the function, though with zero counts every day.
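
The effect of restricting and extending the groups can be sketched in base R with a factor whose levels are the requested groups. This is only an illustration of the idea, not the package's implementation; the vectors syndrome and date and the group "Respiratory" are hypothetical.

```r
# Hypothetical events, one syndrome label and one date per event
syndrome <- c("GIT", "Mastitis", "GIT", "Nonspecific")
date     <- as.Date(c("2011-01-03", "2011-01-03", "2011-01-04", "2011-01-04"))

# Groups to monitor: drops "Nonspecific", adds the unseen "Respiratory"
syndromes.name <- c("GIT", "Mastitis", "Respiratory")

# RESTRICT: keep only events belonging to the requested groups
keep <- syndrome %in% syndromes.name

# EXTEND: fixing the factor levels keeps a column for unseen groups
observed <- table(
  format(date[keep]),
  factor(syndrome[keep], levels = syndromes.name)
)
observed
```

The resulting table has one column per requested group, including an all-zero column for the group that never occurred in the data.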

Value

An object of the class syndromicD with the following slots: (1) OBSERVED: a matrix with as many columns as syndromic groups found in the dataset (or listed by the user); (2) DATES: a data frame where the first column contains the complete sequence of dates from the minimum to the maximum date found in the dataset (or set by the user), and additional columns contain additional date variables (such as day of week, holidays, month), as generated by default when an object of the class syndromicD is created.
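
To give a sense of what a data frame of dates with derived date variables might look like, here is a minimal base R sketch. The column names dates, dow and mon are assumptions made for illustration (dow and mon follow the names used in the formula argument); the actual columns of the DATES slot are defined by the package and may differ.

```r
# Complete daily sequence for a hypothetical first week of January 2011
dates <- seq(as.Date("2011-01-01"), as.Date("2011-01-07"), by = "day")

# Derived date variables (column names are assumptions, see above)
dates.df <- data.frame(
  dates = dates,
  dow   = as.POSIXlt(dates)$wday,    # 0 = Sunday ... 6 = Saturday
  mon   = as.POSIXlt(dates)$mon + 1  # 1 = January ... 12 = December
)
dates.df
```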

Examples

library(vetsyn)
data(lab.daily)
# Arguments given as vectors
my.syndromicD <- raw_to_syndromicD (id=lab.daily$SubmissionID,
                                  syndromes.var=lab.daily$Syndrome,
                                  dates.var=lab.daily$DateofSubmission,
                                  date.format="%d/%m/%Y")

# Arguments given as column names of the data frame passed through data
my.syndromicD <- raw_to_syndromicD (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

# Unique events identified by a combination of columns
my.syndromicD <- raw_to_syndromicD (id=list(HerdID,AnimalID),
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

# syndromes.name used to RESTRICT the groups counted
my.syndromicD <- raw_to_syndromicD (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  syndromes.name=c("GIT","Musculoskeletal"),
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

# syndromes.name used to EXTEND the groups with one not found in the data
my.syndromicD <- raw_to_syndromicD (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  syndromes.name=c("GIT","Musculoskeletal","NonExisting"),
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

# Earlier minimum date; weekends removed and their counts added to Monday
my.syndromicD <- raw_to_syndromicD (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  min.date="01/01/2011",
                                  date.format="%d/%m/%Y",
                                  remove.dow=c(6,0),
                                  add.to=c(2,1),
                                  data=lab.daily)

nandadorea/vetsyn documentation built on April 30, 2022, 1:15 a.m.