raw_to_syndromic-methods: 'raw_to_syndromicW'

raw_to_syndromicW    R Documentation

raw_to_syndromicW

Description

An object of the class syndromicW (the main syndromic class for data to be monitored weekly) can be created from data originally recorded with the WEEK of the event, or from data recorded with the DATE of the event when the user wants to group events per week. The function handles both cases, groups events by week, and outputs counts by week. For data already grouped into the number of observations per week, please see the function syndromicW.

Usage

raw_to_syndromicW(id, syndromes.var, syndromes.name, dates.var,
  date.format = "%d/%m/%Y", min.date, max.date, sort = TRUE,
  data = NULL, formula = list())

Arguments

id

Indicates a variable (or multiple variables) which should be used to identify unique events in the data. It can be provided as an R vector (e.g. mydata$myid), as the name of a data.frame column (e.g. id=myid, data=my.data), or as multiple column names (e.g. id=list(id1,id2,id3), data=my.data).

syndromes.var

The variable that identifies group membership (in general, the syndromic grouping). Can be numeric, character or factor.

syndromes.name

An optional argument providing the syndromic groups to be monitored. If not given, it is taken from the values found in syndromes.var. When syndromes.name IS provided, it should be a character value or vector (e.g. "Mastitis" or c("Mastitis","GIT")).

dates.var

The vector (dates.var=mydata$mydates) or column name (dates.var=mydates, data=mydata) where the dates of the events are to be found. This parameter can handle a column in which the DATE of the event was recorded or one in which the WEEK of the event was recorded. In the latter case, however, the week MUST be in the ISOweek format (e.g. "2014-W02-1"). In either case the function will group events by week, and outputs will be in the ISOweek format.

date.format

The format of the dates variable. Default is "%d/%m/%Y". See strptime() for format specifications. If the dates variable records the WEEKS of the events, please set date.format="ISOweek".

min.date

An optional argument. If not provided, the minimum date found in the dataset is used. Must be provided in the same format as set in "date.format", that is, the same date format used when the DATE of the event was recorded, or ISOweek.

max.date

An optional argument. If not provided, the maximum date found in the dataset is used. As with min.date, it can be a date or an ISOweek.

sort

Default is TRUE, which orders the groups found in syndromes.name alphabetically. If set to FALSE, groups are listed in the order they are found in the dataset or provided in syndromes.name.

data

Optional argument. If used, the other arguments can be specified as column names within the dataset provided through this argument.

formula

A formula, or list of formulas, specifying the regression formula to be used when removing temporal patterns from each of the syndromes in @observed. For instance, formula=list(y~dow+mon) for a single syndrome, where the regression must take into account the variables dow (day-of-week) and mon (month); or formula=c(y~dow, y~dow+mon) specifying two different formulas for two syndromes. The names of the variables given should exist in the columns of the slot @dates. Make sure that the order of the formulas matches the columns in observed (for instance, the second formula should correspond to the second syndrome, or second column in the observed matrix). You can provide NA for syndromes which should not be associated with any formula. This parameter is often only filled in after some analysis of the data, not at the time of object creation.

Details

This function will count the number of cases for one or more defined groups, weekly. Weeks without counts will be assigned a count of zero, generating a complete sequence of weeks. By default, the complete sequence will start at the minimum week found in the dataset and end at the maximum week. However, it is also possible to provide a minimum DATE EARLIER than the minimum in the dataset (since the original data were recorded based on the date, it is assumed that the user may want to establish cut-offs based on the dates from the original data, not weeks, which is the format of the output). It is also possible to provide a maximum date LATER than the latest recorded. The extra weeks created are assigned counts of zero (minimum or maximum dates already within the range of the dataset are ignored).
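The zero-filling of weeks can be illustrated with a minimal base R sketch. This is not the package's implementation; the event dates and the variable names (event.dates, all.weeks, weekly.counts) are hypothetical, and weeks are labelled with the ISO year and week only ("%G-W%V"), without the weekday suffix.

```r
# Hypothetical event dates (not taken from the package data)
event.dates <- as.Date(c("2014-01-06", "2014-01-08", "2014-01-27"))

# ISO year-week label of each event, e.g. "2014-W02"
event.weeks <- format(event.dates, "%G-W%V")

# Complete sequence of weeks; min.date lies BEFORE the first event
min.date <- as.Date("2013-12-30")
max.date <- max(event.dates)
all.weeks <- unique(format(seq(min.date, max.date, by = "day"), "%G-W%V"))

# Count events per week; weeks without events receive a count of zero
weekly.counts <- table(factor(event.weeks, levels = all.weeks))
print(weekly.counts)
```

Here the sequence runs from 2014-W01 to 2014-W05; only W02 (two events) and W05 (one event) have non-zero counts.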

The raw, observed data are assumed to be stored in a data.frame in which each observed event (for instance a laboratory submission) is recorded in one or multiple rows. Unique events can be identified by one unique ID. It is possible, however, to take into consideration a hierarchical organization of the data, in which a unique ID can only be established by taking into account multiple columns (e.g. animal ID is unique within farm, but not between farms, therefore the IDs are unique combinations of the variables "farm" and "animal").

Multiple events with the same unique ID are acceptable, but counted only once per time unit (e.g. WEEK). Besides removing duplicated events, the function also completes missing weeks, assigning them a count of zero.

The function counts the number of events, per week, for each of the groups found in the variable syndromes.var. However, the variable syndromes.name can be used to RESTRICT the groups counted (if not all values appearing in the data are to be subjected to monitoring, e.g. when "nonspecific" or "non-classified" values exist); or to EXTEND the list to include values which did not appear in the dataset (this is the recommended use of this function for regular monitoring, in order to ensure that groups with zero events in the specific data batch being analyzed will still be represented in the output of the function, though with zero counts every week).
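The RESTRICT and EXTEND behaviours can be pictured with factor levels in base R. This is only a sketch under assumed data: the syndrome vector and the group "Respiratory" are hypothetical, and the package may implement this differently.

```r
# Hypothetical syndromic classification of five events
syndrome <- c("GIT", "Mastitis", "GIT", "Nonspecific", "Mastitis")

# RESTRICT: values not listed as levels become NA and are not counted
restricted <- table(factor(syndrome, levels = c("GIT", "Mastitis")))

# EXTEND: a listed group absent from the data is kept, with a count of zero
extended <- table(factor(syndrome,
                         levels = c("GIT", "Mastitis", "Respiratory")))

print(restricted)
print(extended)
```

Fixing the list of groups in this way keeps the columns of the output stable from one data batch to the next.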

IMPORTANT: Please note that this function removes DUPLICATED records based on a repeated id within the same DATE, since daily records are provided. If two cases with the same ID are recorded in the same week, but on different days, these will be counted as TWO CASES. To eliminate repeated cases within the same week, please convert the date to ISOweek format using the functions in this package, and use, instead, the function rawW_to_syndromicW.
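The difference between removing duplicates per DATE and per WEEK can be seen in a small base R sketch. The data frame events and its columns are hypothetical, and the code only illustrates the distinction; it is not the package's own code.

```r
# Hypothetical records: id "A1" appears twice on one day and again two days later
events <- data.frame(
  id   = c("A1", "A1", "A1", "B7"),
  date = as.Date(c("2014-01-06", "2014-01-06", "2014-01-08", "2014-01-07"))
)
events$week <- format(events$date, "%G-W%V")  # all four fall in "2014-W02"

# Duplicates removed within the same DATE: A1 still counts twice in the week
by.date <- unique(events[, c("id", "date", "week")])

# Duplicates removed within the same WEEK: A1 counts once
by.week <- unique(events[, c("id", "week")])

print(nrow(by.date))
print(nrow(by.week))
```

With these records, removing duplicates per date leaves three cases in the week, while removing them per week leaves two.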

Value

An object of the class syndromicW with the following slots: (1) OBSERVED: A matrix with as many columns as syndromic groups found in the dataset (or listed by the user); (2) DATES: A data frame where the first column contains the complete sequence of weeks, in ISOweek format, from the minimum to the maximum date found in the dataset (or set by the user), and additional columns contain additional date variables (such as numerical week and year) as generated by default when an object of the class syndromicW is created.

Examples

##data recorded with the DATE of the event
data(lab.daily)
## ids, syndromes and dates passed directly as vectors
my.syndromicW <- raw_to_syndromicW (id=lab.daily$SubmissionID,
                                  syndromes.var=lab.daily$Syndrome,
                                  dates.var=lab.daily$DateofSubmission,
                                  date.format="%d/%m/%Y")

## same call, using column names together with the data argument
my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

## unique events identified by a combination of columns
my.syndromicW <- raw_to_syndromicW (id=list(HerdID,AnimalID),
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

## restrict the counts to selected syndromic groups
my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  syndromes.name=c("GIT","Musculoskeletal"),
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

## include a group that does not appear in the data (zero counts)
my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  syndromes.name=c("GIT","Musculoskeletal","NonExisting"),
                                  dates.var=DateofSubmission,
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

## set the minimum date, in the same format as date.format
my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                  syndromes.var=Syndrome,
                                  dates.var=DateofSubmission,
                                  min.date="01/01/2011",
                                  date.format="%d/%m/%Y",
                                  data=lab.daily)

##data recorded with the WEEK of the event
my.syndromicW <- raw_to_syndromicW (id=lab.weekly$SubmissionID,
                                    syndromes.var=lab.weekly$Syndrome,
                                    dates.var=lab.weekly$DateofSubmission,
                                    date.format="ISOweek")

my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="ISOweek",
                                   data=lab.weekly)

my.syndromicW <- raw_to_syndromicW (id=list(HerdID,AnimalID),
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="ISOweek",
                                   data=lab.weekly)

my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   syndromes.name=c("GIT","Musculoskeletal"),
                                   dates.var=DateofSubmission,
                                   date.format="ISOweek",
                                   data=lab.weekly)

my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   syndromes.name=c("GIT","Musculoskeletal","NonExisting"),
                                   dates.var=DateofSubmission,
                                   date.format="ISOweek",
                                   data=lab.weekly)

## set the minimum date, given in ISOweek format
my.syndromicW <- raw_to_syndromicW (id=SubmissionID,
                                   syndromes.var=Syndrome,
                                   dates.var=DateofSubmission,
                                   date.format="ISOweek",
                                   min.date="2010-W50-1",
                                   data=lab.weekly)

nandadorea/vetsyn documentation built on April 30, 2022, 1:15 a.m.