In covid19br/nowcaster: Nowcaster

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", 
  warning = F, 
  message = F, 
  echo = T
)

First example on LazyData

When the package is loaded it provides a LazyData file, sariBH, it is a anonymized records of Severe Acute Respiratory Illness notified in the city of Belo Horizonte, since March 2020 to April 2022. To load it basically write:

library(nowcaster)
# Loading Belo Horizonte SARI dataset
data(sragBH)

And we take a look on the data:

head(sragBH)

It is a data.frame with 7 variables and 65,404 observations. We will make use of only the first two columns, "DT_SIN_PRI" (date of onset symptoms) and "DT_DIGITA" (recording date) as well the column "Idade" (age in years) to make age structured nowcasting.

The call of the function is straightforward, it simply needs a dataset as input, here the LazyData loaded in the namespace of the package. The function has 3 mandatory parameters, dataset for the parsing of the dataset to be nowcasted, date_onset for parsing the column name which is the date of onset of symptoms and date_report which parses the column name for the date of report of the cases. Here this columns are "DT_SIN_PRI" and "DT_DIGITA", respectively.

nowcasting_bh_no_age <- nowcasting_inla(dataset = sragBH, 
                                        date_onset = "DT_SIN_PRI", 
                                        date_report = "DT_DIGITA", 
                                        data.by.week = T)
head(nowcasting_bh_no_age$total)

This calling will return for the first element the nowcasting estimate and its Confidence Interval (CI) for two different Credible interval, LIb and LSb are the max and min CI, respectively, with credibility of 50% and LI and LS are the max and min CI, respectively, with credibility of 95%.

On the second element it returns the data to be grouped and summarized to give the epidemic curve, we can take a look on this element.

library(ggplot2)
library(dplyr)

dados_by_week <- nowcasting_bh_no_age$data |> 
  dplyr::group_by(dt_event) |> 
  dplyr::reframe(
    observed = sum(Y, na.rm = T)
  ) |>
  dplyr::filter(dt_event >= max(dt_event)-270)

dados_by_week |> 
  ggplot()+
  geom_line(data = dados_by_week, 
            aes(dt_event, 
                y = observed, 
                col = 'Observed'))+
  theme_bw()+
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')

After this element is groped by and summarized by the onset of symptoms date, here DT_SIN_PRI, it is the epidemiological curve observed. To example how the estimate compares with the observed curve, we plot the estimate and the epidemiological curve all together.

nowcasting_bh_no_age$total |> 
  ggplot(aes(x = dt_event, y = Median, 
             col = 'Nowcasting')) +
  geom_line(data = dados_by_week, 
            aes(y = observed, 
                col = 'Observed'))+
  geom_ribbon(aes(ymin = LI, ymax = LS, col = NA), 
              alpha = 0.2, 
              show.legend = F)+
  geom_line()+
  theme_bw()+
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')