TL;DR

This is a short example showing how nowcasting estimates can support the decision-making process.

Nowcasting as a tool to support decision making

Nowcasting a rising curve, or a curve at any other moment, can give quantitative support for decision making. During a public health crisis, what is most needed is a way to anticipate, or at least to know, what is happening at the moment. Nowcasting is the tool for this type of question and can give insights into the data to support the decisions that need to be made.

We start this section by cutting the original data at a moment of apparent decay in SARI hospitalizations for the city of Belo Horizonte, which had a prompt response to the COVID-19 pandemic. Pressure on the health system took longer to build there than in the rest of the country, and at that time the data were showing a decay. We keep all cases entered up to the 4th of July 2020, filtering by the date of digitization, which is the date on which the case shows up in the database.

library(tidyverse)
library(lubridate)
library(nowcaster)

## To see Nowcasting as if we were on the verge of a rise in the curve
data("sragBH")
srag_now <- sragBH |> 
  filter(DT_DIGITA <= as.Date("2020-07-04"))

data_by_week <- data.w_no_age(dataset = srag_now, 
                              date_onset = DT_SIN_PRI, 
                              date_report = DT_DIGITA) |> 
  group_by(date_onset) |> 
  tally()

data_by_week |>  
  ggplot(aes(x = date_onset, 
             y = n))+
  geom_line()+
  theme_bw()+
  labs(x = 'Date of onset of symptoms', 
       y = 'Nº Cases')+
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')

On this filtered data, we estimate the cases whose symptoms have already started but which have not yet been reported, and so are not yet in the database. We pass to the nowcasting_inla function the filtered data set and the names of the columns holding the date of onset (date_onset) and the date of report (date_report). We also set the data.by.week flag so that the function returns the epidemic curve by epiweek.

nowcasting_bh_no_age <- nowcasting_inla(dataset = srag_now, 
                                        date_onset = DT_SIN_PRI, 
                                        date_report = DT_DIGITA, 
                                        data.by.week = TRUE)
head(nowcasting_bh_no_age$data)
#> # A tibble: 6 × 4
#>   dt_event   delay     Y  Time
#>   <date>     <dbl> <dbl> <int>
#> 1 2019-12-28     0     1     1
#> 2 2019-12-28     1     2     1
#> 3 2019-12-28     2     4     1
#> 4 2019-12-28     3     5     1
#> 5 2019-12-28     4     2     1
#> 6 2019-12-28     5     2     1

Before we look at the result of the nowcasting estimate, we take a look at an intermediate part of the nowcasting process, the delay triangle, which sets up the objects for nowcasting. The delay triangle is simply a table in which, for each date of onset, the cases are counted by the amount of delay (i.e., an integer number of days or weeks) that has passed between the date of onset and the date of report. The part that is closer to the present has fewer counts and only short delays. This is expected: the system takes time to process cases, so the most recent onset dates show fewer cases than older ones, which have already had time to be processed.
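To make the structure concrete, the following is a minimal base R sketch of a delay triangle built from a toy line list. The data frame `cases`, its columns `onset` and `report`, and all dates are hypothetical and chosen only for illustration; this is not how nowcaster builds the triangle internally.

```r
# Toy line list: one row per case (hypothetical dates, for illustration only)
cases <- data.frame(
  onset  = as.Date(c("2020-06-06", "2020-06-06", "2020-06-06",
                     "2020-06-13", "2020-06-13", "2020-06-20")),
  report = as.Date(c("2020-06-06", "2020-06-13", "2020-06-20",
                     "2020-06-13", "2020-06-20", "2020-06-20"))
)

# Delay in whole weeks between onset and report
delay_days <- as.numeric(difftime(cases$report, cases$onset, units = "days"))
cases$delay <- delay_days %/% 7

# Rows are onset weeks, columns are delays, cells are case counts
triangle <- unclass(table(onset = cases$onset, delay = cases$delay))

# Cells that could not yet have been observed at the last report date
# are unknown rather than zero, so mark them as NA
last_report <- max(cases$report)
onset_dates <- as.Date(rownames(triangle))
delays <- as.integer(colnames(triangle))
for (i in seq_along(onset_dates)) {
  unobservable <- onset_dates[i] + 7 * delays > last_report
  triangle[i, unobservable] <- NA
}

triangle
```

The NA cells in the lower right corner are the ones a nowcasting model has to estimate.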

From the data in weekly format, we build the counts of cases by amount of delay. Tabulating the delay against the date of onset of the first symptoms shows the pattern of delay for the cases.

library(dplyr)

data_triangle <- nowcasting_bh_no_age$data |> 
  filter(delay < 30) |> 
  arrange(delay) |> 
  select(-Time)

data_triangle |> 
  filter(dt_event >= (max(dt_event) - 84),
         delay <= 10) |> 
  tidyr::spread(key = delay, value = Y)
#> # A tibble: 13 × 12
#>    dt_event     `0`   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`
#>    <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 2020-04-04    15   103    38    25    24     8    10     8     5     4     7
#>  2 2020-04-11    24    79    63    30    23    21     7    12     3     4     0
#>  3 2020-04-18    17   111    80    22    15    10    16     3     5     5     1
#>  4 2020-04-25    22   143    51    26    17    16     4    12     7     4    NA
#>  5 2020-05-02    39   106    87    24     9     5     8     6     1    NA    NA
#>  6 2020-05-09    37   154    68    32     8    12    10    10    NA    NA    NA
#>  7 2020-05-16    31   153    80    20    22    15    17    NA    NA    NA    NA
#>  8 2020-05-23    41   154    70    67    24    15    NA    NA    NA    NA    NA
#>  9 2020-05-30    34   174   191    98    35    NA    NA    NA    NA    NA    NA
#> 10 2020-06-06    22   240   232    62    NA    NA    NA    NA    NA    NA    NA
#> 11 2020-06-13    44   355   252    NA    NA    NA    NA    NA    NA    NA    NA
#> 12 2020-06-20    61   363    NA    NA    NA    NA    NA    NA    NA    NA    NA
#> 13 2020-06-27    84    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

We only look at cases with up to 30 weeks of delay, since that is the default maximum delay considered in the nowcasting estimation. It can be changed with the parameter Dmax.

If this element is grouped and summarized by the date of onset of symptoms, here DT_SIN_PRI, the result is the observed epidemiological curve. To illustrate, we plot the estimate and the observed epidemiological curve together.

library(ggplot2)

dados_by_week <- nowcasting_bh_no_age$data |> 
  dplyr::group_by(dt_event) |> 
  dplyr::reframe(
    observed = sum(Y, na.rm = TRUE)
  )

nowcasting_bh_no_age$total |> 
  ggplot(aes(x = dt_event, y = Median, 
             col = 'Nowcasting')) +
  geom_line(data = dados_by_week, 
            aes(x = dt_event, y = observed, 
                col = 'Observed'))+
  geom_ribbon(aes(ymin = LI, ymax = LS, col = NA), 
              alpha = 0.2, 
              show.legend = FALSE)+
  geom_line()+
  theme_bw()+
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')

As expected, the nowcasting estimated a rising curve at a time when a decay was being observed. We now add to the plot what actually happened in that period, using data entered after the period for which the nowcasting estimated the rise in the curve of SARI hospitalizations.

nowcasting_bh_no_age$total %>% 
  ggplot(aes(x = dt_event, y = Median, col = 'Nowcasting')) +
  geom_line(data = dados_by_week, 
            aes(x = dt_event, y = observed, col = 'Observed'))+
  geom_ribbon(aes(ymin = LI, ymax = LS, col = NA), 
              alpha = 0.2, 
              show.legend = FALSE)+
  geom_line()+
  geom_line( data = sragBH %>%
               filter(DT_SIN_PRI <= "2020-07-04") %>% 
               mutate(
                 D_SIN_PRI_2 = DT_SIN_PRI - as.numeric(format(DT_SIN_PRI, "%w"))
               ) %>% 
               group_by(D_SIN_PRI_2) %>% 
               tally(), 
             mapping = aes(x = D_SIN_PRI_2, y = n, 
                           color = "Observed after one year")) + 
  theme_bw() +
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black', 'red'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')

This ends the first simple example of estimating events that have already started but have not yet been reported (i.e., nowcasting). Nowcasting is relevant to public health decisions because what is present in the databases is only a partial picture of the real-time situation. The graph above can help policymakers decide what to do in the face of a rising curve of cases, hospitalizations, or deaths.
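The visual comparison in the plot can also be expressed numerically. The sketch below checks whether the later consolidated counts fall inside the interval given by LI and LS, the bounds drawn as the ribbon above. All numbers are made up for illustration and are not results from the Belo Horizonte data; the data frames `nowcast` and `final` are hypothetical stand-ins whose column names mirror `nowcasting_bh_no_age$total` and a weekly tally of the consolidated data.

```r
# Hypothetical nowcast summary (made-up numbers, not real results)
nowcast <- data.frame(
  dt_event = as.Date(c("2020-06-13", "2020-06-20", "2020-06-27")),
  Median   = c(700, 650, 600),
  LI       = c(660, 580, 480),
  LS       = c(750, 740, 760)
)

# Hypothetical consolidated counts observed much later (made-up numbers)
final <- data.frame(
  dt_event = as.Date(c("2020-06-13", "2020-06-20", "2020-06-27")),
  n        = c(710, 690, 770)
)

# Align the two tables by week
check <- merge(nowcast, final, by = "dt_event")

# Is the consolidated count inside the nowcast interval?
check$covered <- check$n >= check$LI & check$n <= check$LS

# Relative error of the median against the consolidated count
check$rel_error <- (check$Median - check$n) / check$n

# Share of weeks whose consolidated count falls inside the interval
coverage <- mean(check$covered)
check
```

With real output, `nowcast` would be replaced by the `total` element of the nowcasting result and `final` by counts tallied from the consolidated database, assuming both use the same week alignment.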

Conclusion

In this vignette, we make the case for using nowcasting as a supporting tool for decision making. By assessing the estimate against data from a year after it was observed, we checked the goodness of the estimates produced, showing that a nowcast produced at the time the data were observed provides a good estimate of the true number of cases occurring.