Nowcasting for decision making
Rafael Lopes & Leonardo Bastos
Source:vignettes/articles/2_nowcasting_importance.Rmd
2_nowcasting_importance.Rmd
TL;DR
This is a short example showing how helpful it can be to use nowcasting estimate to help the decision-making process
Nowcasting as a tool to support decision making
Nowcasting a rising curve or a curve at any other moment can give quantitative support for decision making, during the public health crises, the most needed is a way to anticipate, at least, what it is happening at the moment. Nowcasting is the tool for this type of questioning and can give insights into the data to support the needed decisions.
We start this section by cutting the original data at a moment of apparent decaying of the SARI hospitalization, for the city of Belo Horizonte, which had a prompt response to COVID-19 pandemic. The pressure on the health system took more time than the rest of the country, and the data at the same time were showing a decay. We filter all cases entered until the 4th of July 2020 by the date of digitization, a date that the case shows up in the database.
library(tidyverse)
library(lubridate)
library(nowcaster)
## To see Nowcasting as if we were on the verge of a rise in the curve
data("sragBH")
srag_now<-sragBH |>
filter(DT_DIGITA <= "2020-07-04")
data_by_week <- data.w_no_age(dataset = srag_now,
date_onset = DT_SIN_PRI,
date_report = DT_DIGITA) |>
group_by(date_onset) |>
tally()
data_by_week |>
ggplot(aes(x = date_onset,
y = n))+
geom_line()+
theme_bw()+
labs(x = 'Date of onset of symptons',
y = 'Nº Cases')+
scale_color_manual(values = c('grey50', 'black'),
name = '')+
scale_x_date(date_breaks = '2 weeks',
date_labels = '%V/%y',
name = 'Date in Weeks')
On this filtered data, we estimate the cases that started their
date of onset of symptoms, but were not yet reported, so they are not in the
database. We just pass to the nowcasting_inla
function, the
data set filtered, flag for the columns where the
date_onset
and date_report
, we add the flag
for the function, return back the epidemic curve by epiweek.
nowcasting_bh_no_age <- nowcasting_inla(dataset = srag_now,
date_onset = DT_SIN_PRI,
date_report = DT_DIGITA,
data.by.week = T)
head(nowcasting_bh_no_age$data)
#> # A tibble: 6 × 4
#> dt_event delay Y Time
#> <date> <dbl> <dbl> <int>
#> 1 2019-12-28 0 1 1
#> 2 2019-12-28 1 2 1
#> 3 2019-12-28 2 4 1
#> 4 2019-12-28 3 5 1
#> 5 2019-12-28 4 2 1
#> 6 2019-12-28 5 2 1
Before we see the result of the nowcasting estimate, we take a look at intermediate part of the process of nowcasting, the delay triangle, which sets the objects for nowcasting. The delay triangle is only a table where each unique amount of delay, (i.e., integer numbers of days for weeks) has passed betweenthe date of onset and the date of report spread over each date of onset. The part that is closer to the present has less counts and has a lower amount of delay, this is trivial due to, as the the system takes time to process the cases, the newer cases are fewer than the older ones, which are already time to be processed.
From the data in weekly format, we mount the counts of cases by the amount of delay. By tabulating the delay amount against the data of onset of the first symptoms, to see the pattern of the delay for the cases.
library(dplyr)
data_triangle <- nowcasting_bh_no_age$data |>
filter(delay < 30) |>
arrange(delay) |>
select(-Time)
data_triangle |>
filter(dt_event >= (max(dt_event) - 84),
delay <= 10) |>
tidyr::spread(key = delay, value = Y)
#> # A tibble: 13 × 12
#> dt_event `0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `10`
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-04-04 15 103 38 25 24 8 10 8 5 4 7
#> 2 2020-04-11 24 79 63 30 23 21 7 12 3 4 0
#> 3 2020-04-18 17 111 80 22 15 10 16 3 5 5 1
#> 4 2020-04-25 22 143 51 26 17 16 4 12 7 4 NA
#> 5 2020-05-02 39 106 87 24 9 5 8 6 1 NA NA
#> 6 2020-05-09 37 154 68 32 8 12 10 10 NA NA NA
#> 7 2020-05-16 31 153 80 20 22 15 17 NA NA NA NA
#> 8 2020-05-23 41 154 70 67 24 15 NA NA NA NA NA
#> 9 2020-05-30 34 174 191 98 35 NA NA NA NA NA NA
#> 10 2020-06-06 22 240 232 62 NA NA NA NA NA NA NA
#> 11 2020-06-13 44 355 252 NA NA NA NA NA NA NA NA
#> 12 2020-06-20 61 363 NA NA NA NA NA NA NA NA NA
#> 13 2020-06-27 84 NA NA NA NA NA NA NA NA NA NA
We just look at the number of cases with 30 weeks of delay or less,
it is the default maximum delay considered in nowcasting estimation. It
can be changed by the parameter Dmax
.
If this element is grouped by and summarized by the onset of symptoms
date, here DT_SIN_PRI
, it is the epidemiological curve
observed. To exemplify it, we plot the estimate and the epidemiological
curve altogether.
library(ggplot2)
dados_by_week <- nowcasting_bh_no_age$data |>
dplyr::group_by(dt_event) |>
dplyr::reframe(
observed = sum(Y, na.rm = T)
)
nowcasting_bh_no_age$total |>
ggplot(aes(x = dt_event, y = Median,
col = 'Nowcasting')) +
geom_line(data = dados_by_week,
aes(x = dt_event, y = observed,
col = 'Observed'))+
geom_ribbon(aes(ymin = LI, ymax = LS, col = NA),
alpha = 0.2,
show.legend = F)+
geom_line()+
theme_bw()+
theme(legend.position = "bottom",
axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c('grey50', 'black'),
name = '')+
scale_x_date(date_breaks = '2 weeks',
date_labels = '%V/%y',
name = 'Date in Weeks')+
labs(x = '',
y = 'Nº Cases')
And as expected, the nowcasting estimated a rising curve when it a decaying was observed. Adding to the plot, what has happened in that period, with the data inserted after the period for when the nowcasting estimated the rise in the curve for SARI hospitalizations.
nowcasting_bh_no_age$total %>%
ggplot(aes(x = dt_event, y = Median, col = 'Nowcasting')) +
geom_line(data = dados_by_week,
aes(x = dt_event, y = observed, col = 'Observed'))+
geom_ribbon(aes(ymin = LI, ymax = LS, col = NA),
alpha = 0.2,
show.legend = F)+
geom_line()+
geom_line( data = sragBH %>%
filter(DT_SIN_PRI <= "2020-07-04") %>%
mutate(
D_SIN_PRI_2 = DT_SIN_PRI - as.numeric(format(DT_SIN_PRI, "%w"))
) %>%
group_by(D_SIN_PRI_2) %>%
tally(),
mapping = aes(x = D_SIN_PRI_2, y = n,
color = "Observed after one year")) +
theme_bw() +
theme(legend.position = "bottom",
axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c('grey50', 'black', 'red'),
name = '')+
scale_x_date(date_breaks = '2 weeks',
date_labels = '%V/%y',
name = 'Date in Weeks')+
labs(x = '',
y = 'Nº Cases')
This ends the first simple example when estimating the already started events but not yet reported (i.e., nowcasting). The relevance of nowcasting for public health decisions is given by the understanding that what is present on the databases is only a picture of the real time situation. The above graph can help policymakers with what decisions they can take in the face of a rising curve of cases, hospitalizations, or deaths.
Conclusion
In this vignette,s we make the case for using nowcasting as a helping tool for decision making, by accessing the estimate with data from a year after it was observed, we checked the goodness of the estimates produced, showing that the nowcasting had been produced at the time of observation of the data provides a good estimate of the true number of cases happening.