1 Introduction
There is a trend to forecast COVID19 using mathematical models for the probability of moving between states from susceptible to infected, and then to a recovered state or death (SIR models). This approach is however very sensitive to starting assumptions and tend to overestimated the virus reproductive rate. One key point that these models miss is the individual behavioral responses and governmentmandated policies that can dramatically influence the course of the epidemic. In Wuhan, for instance, strict social distancing was instituted on January 23rd, 2020, and by March 15th new infections were close to zero. Taking into account this observation,
COVID et al. (2020) have proposed a statistical approach to model a empirical cumulative population death rate. However, modelling observed cumulative death numbers has the inherent problem of yielding a highly correlated data which, if it is not taken into account, could cause misleading inference results.Instead of considering a mathematical model such as SIR, that rely heavily on parameters assumptions, we will directly work as in COVID et al. (2020) or (Zhou et al., 2020)
with empirical data. To this end, we propose to model the daily number of deaths using a Poisson distribution with a rate parameter that is proportional to a Skew Normal density. Using a Bayesian approach and a prior epidemic China COVID19 history, we forecast the total number of deaths in Peru for the next seventy days.
2 Methods
2.1 Model
Let be the number of COVID19 deaths at times , where time is measured from the first reported death due to COVID19, and let us suppose that the death rate will hit a platoon due to the government intervention. We propose then the model
(1) 
with death rate
(2) 
where
denotes the density function of a Skew Normal distribution with location, scale, and shape parameters,
, , and , respectively; is a maximum asymptotic level parameter andis the population size. The choice of the Skew Normal distribution is motivated by its flexibility on the tails and their asymmetry, which can force, as one could expect, a rapidly increase rate at the first stages of the pandemic and a slower decrease of this rate at the last stages. A negative binomial distribution can also be considered for the deaths numbers, but we found empirically a better fit with the Poisson model.
Our model differs of the approach taken by COVID et al. (2020), who directly models the cumulative death rate
with a term proportional to a symmetric cumulative normal distribution. This difference is not only found in the formulation, but also in the estimation procedure. While these authors incorporated first the Chinese data–more concretely the time from when the initial death rate exceeds 1e15 to the implementation of social distancing– into their model throughout a locationspecific inflection point parameter or a maximum death rate and then used a sort of credibility model between shortrange and longrate variants, we followed a Bayesian approach that incorporates the Chinese data as a prior distribution. The posterior predictive distribution, which is the goal of this model, is then a dynamic object that can be updated with new data about the number of deaths or any other reliable information that may be incorporated as covariates in the model.
Apart from the total number of deaths, three main quantities of interest can be easily derived from our model. First, a time to threshold death rate, which provides the time after which only 0.01
of deaths will be observed on the population. This will be defined as the 0.99 quantile of the
distribution. Another characteristic of interest is the inflection point, defined as the time at which the death rate reaches its maximum level.2.2 Priors
Since China was the first country to have experienced a drastic drop in infections and deaths, we are proposing to incorporate this data, into our Peruvian death rate predictions, through a prior distribution. Figure 1 shows the empirical distribution of number of reported deaths in China. One can notice that the reported numbers on February 13 and 14 (red points) were 254 and 13 deaths, respectively. There is some controversy^{2}^{2}2https://www.bbc.com/news/world51495484 about the reported numbers from China on these days, the reason why we are considering the average number for these days.
Figure 2
shows the observed and predicted death rates (A) and the daily number of deaths (B) in China. The predicted rates and their associated 95% prediction credible intervals were obtained, under a Bayesian approach, by considering a noninformative prior and the Chinese official death reports. Table
1 summarizes the information given in Figure 2. This information will be considered as a prior for modelling the number of deaths in Perú with the exception of the parameter, where a weakly informative prior, , will be considered for .Mean  SD  2.5%  50%  97.5%  

16.52  0.37  15.81  16.52  17.24  
2.91  0.02  2.87  2.91  2.95  
2.34  0.14  2.07  2.34  2.64  
0.87  0.02  0.84  0.87  0.91 
China: Mean, standard deviation (SD) and quantiles 0.25,0.5 and 0.975 of the posterior distribution
Initial values for the estimation process will be sampled from the prior distribution that was constructed with the Chinese data.
2.3 Data
The Chinese data to be used in this work was obtained from the European Centre for Disease Prevention and Control, institution that daily publishes statistics on the COVID19 pandemic. The daily death reports in Perú, on the other hand, were obtained from the local authorities.
2.4 Inference
Taking into account the observed likelihood function, easily derived from (1) and (2
), the posterior distribution of the vector parameter
has a complex expression. To obtain samples from it, we will make use of Hamiltonian MCMC methods as the ones implemented in the R package RStan (Stan Development Team, 2018). This package implements an adaptive Hamiltonian Monte Carlo algorithm (also known as HMC) using a NoUTurn sampler (NUTS) for the stepsize parameter in order to generate efficient transitions to the posterior distribution (Carpenter, Gelman, Hoffman, Lee, Goodrich, Betancourt, Brubaker, Guo, Li and Riddell, 2017).3 Results: Forecasting for Perú
We now briefly describe our predictions for the spread of COVID19 in Perú. The time here will be understood to be measured in days after the first reported death in the country.
3.1 Number of deaths
Figure 3.A shows the estimated cumulative number of deaths for Perú and its associated 95% predictive credibility interval. In particular, the expected total number of deaths is 141.8 (95%CI: 99  192) and 339.7 (95% CI: 239  462) after 20 and 30 days, respectively.
In addition, Figure 3.B shows the expected number of deaths per day and its associated 95% predictive credibility interval. In particular, the expected number of deaths is 17.5 (95%CI: 9  28) and 19.3 (95% CI: 10  31) at 20 and 30 days, respectively. Overall, the total number of deaths is expected to be 611.6 (95%CI: 604.3  833.7) persons.
3.2 Time to threshold death rate
The model estimates that the number of days, since the first death, to achieve the threshold death rate will be 63.8 (95%CI: 52.9  65.84) days. At this time, the 99% of expected deaths will have been observed.
3.3 Inflection point
The estimated inflection point is 25.9 (95%CI: 25.1  26.8), which means that we expect to spend approximately 26 days before the COVID19 death rate starts to decline after the first reported death case.
4 Sensitivity analysis
Three different scenarios were considered, each keeping the weakly informative prior lognormal distribution for
. In all cases, the posterior mean for the fitted Chinese model was taken as the mean prior for the modelling of data from Perú. The first scenario (I) takes also the posterior Chinese variances as the prior Peruvian variances. The second scenario (II) induces flexibility in the prior distribution by increasing their corresponding standard deviations by a factor of five. Finally, the third scenario (III) increases the standard deviations by a factor of ten.
Table 2 and Figure 4 shows that the point estimates are not heavily affected, but the precision pays the price of not having yet enough data from Perú.
Scenario  Time to threshold  Inflection point  Total number of deaths 

I  63.8 (61.9  65.8)  26.0 (25.1  26.8)  611.6 (437.2  833.7) 
II  64.7 (55.4  74.9)  26.2 (22.0  30.3)  568.7 (299.4  1037.5) 
III  66.6 (48.9  88.5)  26.6 (19.0  34.9)  629.6 (225.8  1588.4 ) 
5 Discussion
Considering the information on daily number of deaths from China and from Perú, our estimation of total number of deaths will be 611.6 (95% CI: 437.2  833.7) and 99% of those deaths will occur 63.8 (95% CI: 52.9  65.8) days after the first death reported case.
The present study has some limitations. Although Perú has not followed all the measures taken in China, we assumed that Peruvian interventions will have similar effects as the ones observed in that country. However, we expect our model is flexible enough to be driven by the Peruvian data. Furthermore, the model will increase its precision as more data from Perú becomes available.
Several extensions are possible for the model. For instance, covariates as daily social mobility indicators or country age distributions can be included in the model, in particular on the parameter that measures the total number of deaths.
We expect our model can be useful to guide some policies that need to be taken by the Peruvian government in order to overcome the COVID19 pandemic. For example, the proposed model can be useful to measure the impact of the COVID19 pandemic on the Peruvian health system.
6 Acknowledgement
We would like to thanks Rodrigo Carrillo Larco for providing us the detailed information of Perú based on reports by the Peruvian Ministry of Health and Dr. Daniel ManriqueVallier for earlier discussions and helpful suggestions on the topic.
References
 (1)
 Carpenter et al. (2017) Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. and Riddell, A. (2017). Stan: A probabilistic programming language, Journal of statistical software 76(1).

COVID et al. (2020)
COVID, I., Murray, C. J. et al. (2020).
Forecasting covid19 impact on hospital beddays, icudays,
ventilatordays and deaths by us state in the next 4 months, medRxiv .
https://www.medrxiv.org/content/early/2020/03/30/2020.03.27.20043752 
Stan Development Team (2018)
Stan Development Team (2018).
RStan: the R interface to Stan.
R package version 2.18.2.
http://mcstan.org/  Zhou et al. (2020) Zhou, X., Hong, N., Ma, Y., He, J., Jiang, H., Liu, C., Shan, G., Su, L., Zhu, W. and Long, Y. (2020). Forecasting the worldwide spread of covid19 based on logistic model and seir model, medRxiv .
Comments
There are no comments yet.