Nowcasting for decision making

TL;DR

This is a short example showing how helpful can be a to use nowcasting estimate to help the decision-making process

Nowcasting as a tool to support decision making

Nowcasting a rising curve or a curve on any other moment can give quantitative support for decision making, during the public health crises, the most needed is a way to anticipate, at least, what it is happening at the moment. Nowcasting is the tool for this type of questioning and can gives insights on the data to support needed decisions.

We start this section by cutting the original data at a moment of apparent decaying of the SARI hospitalization, for the city of Belo Horizonte, which had a prompt starting response to the Covid-19 pandemic. The pressure on the health system took more time than the rest of the country, and the data at same time were showing a decay. We filter all cases entered until 4th of July of 2020 by the date of digitization, a date that the cases shows up in the database.

library(dplyr)
library(tidyr)
library(ggplot2)
library(lubridate)
library(nowcaster)

## To see Nowcasting as if we were on the verge of rise in the curve
data("sragBH")
srag_now<-sragBH |> 
  filter(DT_DIGITA <= "2020-07-04")

data_by_week <- data.w_no_age(dataset = srag_now, 
                              date_onset = DT_SIN_PRI, 
                              date_report = DT_DIGITA) |> 
  group_by(date_onset) |> 
  tally()

data_by_week |>  
  ggplot(aes(x = date_onset, 
             y = n))+
  geom_line()+
  theme_bw()+
  labs(x = 'Date of onset of symptons', 
       y = 'Nº Cases')+
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')

On this filtered data, we estimate the cases already that started its date of onset of symptoms but were not yet reported, so there not in the database. We just pass to the nowcasting_inla function, the data set filtered, flag for the columns where are the date_onset and date_report, we add the flag for the function return back the epidemic curve by epiweek.

nowcasting_bh_no_age <- nowcasting_inla(dataset = srag_now, 
                                        date_onset = DT_SIN_PRI, 
                                        date_report = DT_DIGITA, 
                                        data.by.week = T)
head(nowcasting_bh_no_age$data)
#> # A tibble: 6 × 4
#>   dt_event   delay     Y  Time
#>   <date>     <dbl> <dbl> <int>
#> 1 2019-12-28     0     1     1
#> 2 2019-12-28     1     2     1
#> 3 2019-12-28     2     4     1
#> 4 2019-12-28     3     5     1
#> 5 2019-12-28     4     2     1
#> 6 2019-12-28     5     2     1

Before we see the result of the nowcasting estimate we take a look on intermediate part of the process of nowcasting, the delay triangle, which sets the objects for nowcasting. The delay triangle is only a table where each unique amount of delay, (i.e. integer numbers of days or weeks) has passed between date of onset and date of report spread over each date of onset. The part that is closer to the present has less counts and has a lower amount of delay, this trivial due to, as the system takes time to process the cases, the most newer cases are lesser than the older ones, that already time to be processed.

From the data in weekly format we mount the counts of cases by the amount of delay. By tabling the delay amount against the data of onset of first symptoms, to see the pattern of the delay for the cases.

library(dplyr)

data_triangle <- nowcasting_bh_no_age$data |> 
  filter(delay < 30) |> 
  arrange(delay) |> 
  select(-Time)

data_triangle |> 
  filter(dt_event >= (max(dt_event) - 84),
         delay <= 10) |> 
  tidyr::spread(key = delay, value = Y)
#> # A tibble: 13 × 12
#>    dt_event     `0`   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`
#>    <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 2020-04-04    15   103    38    25    24     8    10     8     5     4     7
#>  2 2020-04-11    24    79    63    30    23    21     7    12     3     4     0
#>  3 2020-04-18    17   111    80    22    15    10    16     3     5     5     1
#>  4 2020-04-25    22   143    51    26    17    16     4    12     7     4    NA
#>  5 2020-05-02    39   106    87    24     9     5     8     6     1    NA    NA
#>  6 2020-05-09    37   154    68    32     8    12    10    10    NA    NA    NA
#>  7 2020-05-16    31   153    80    20    22    15    17    NA    NA    NA    NA
#>  8 2020-05-23    41   154    70    67    24    15    NA    NA    NA    NA    NA
#>  9 2020-05-30    34   174   191    98    35    NA    NA    NA    NA    NA    NA
#> 10 2020-06-06    22   240   232    62    NA    NA    NA    NA    NA    NA    NA
#> 11 2020-06-13    44   355   252    NA    NA    NA    NA    NA    NA    NA    NA
#> 12 2020-06-20    61   363    NA    NA    NA    NA    NA    NA    NA    NA    NA
#> 13 2020-06-27    84    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

We just look at the amount of cases with 30 weeks of delay or less, it is the default maximum delay considered at nowcasting estimation. It can be changed by the parameter Dmax.

If this element is grouped by and summarized by the onset of symptoms date, here DT_SIN_PRI, it is the epidemiological curve observed. To example it, we plot the estimate and the epidemiological curve all together.

library(ggplot2)

dados_by_week <- nowcasting_bh_no_age$data |> 
  dplyr::group_by(dt_event) |> 
  dplyr::reframe(
    observed = sum(Y, na.rm = T)
  )

nowcasting_bh_no_age$total |> 
  ggplot(aes(x = dt_event, y = Median, 
             col = 'Nowcasting')) +
  geom_line(data = dados_by_week, 
            aes(x = dt_event, y = observed, 
                col = 'Observed'))+
  geom_ribbon(aes(ymin = LI, ymax = LS, col = NA), 
              alpha = 0.2, 
              show.legend = F)+
  geom_line()+
  theme_bw()+
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')

And as expected, the nowcasting estimated a rising on curve when it were observed a decaying. Adding to the plot what actually has happened in that period, with the data inserted after the period for when the nowcasting estimated the rising in the curve for SARI hospitalizations.

nowcasting_bh_no_age$total %>% 
  ggplot(aes(x = dt_event, y = Median, col = 'Nowcasting')) +
  geom_line(data = dados_by_week, 
            aes(x = dt_event, y = observed, col = 'Observed'))+
  geom_ribbon(aes(ymin = LI, ymax = LS, col = NA), 
              alpha = 0.2, 
              show.legend = F)+
  geom_line()+
  geom_line( data = sragBH %>%
               filter(DT_SIN_PRI <= "2020-07-04") %>% 
               mutate(
                 D_SIN_PRI_2 = DT_SIN_PRI - as.numeric(format(DT_SIN_PRI, "%w"))
               ) %>% 
               group_by(D_SIN_PRI_2) %>% 
               tally(), 
             mapping = aes(x = D_SIN_PRI_2, y = n, 
                           color = "Observed after one year")) + 
  theme_bw() +
  theme(legend.position = "bottom", 
        axis.text.x = element_text(angle = 90)) +
  scale_color_manual(values = c('grey50', 'black', 'red'), 
                     name = '')+
  scale_x_date(date_breaks = '2 weeks', 
               date_labels = '%V/%y', 
               name = 'Date in Weeks')+
  labs(x = '', 
       y = 'Nº Cases')

This end the first simple example when estimating the already started events but not yet reported (i.e. nowcasting). The relevance of nowcasting for public health decision is given by the understanding that what is present on the databases are only a picture of the real time situation. The above graph can help policy makers on what decisions they can take in face of a rising curve of cases, hospitalizations or deaths.

Conclusion

On this vignettes we make the case for using nowcasting as a helping tool for decision making, by accessing the estimate with data a year after it was observed we check the goodness of the estimates produced, showing the nowcasting had produced at the time of observation of the data a good estimate of the true number of cases happening.

Rafael Lopes & Leonardo Bastos

TL;DR

Nowcasting as a tool to support decision making

Conclusion