Estimating rates of COVID-19 infection and associated mortality is challenging due to uncertainties in case ascertainment. We perform a counterfactual time series analysis on overall mortality data from towns in Italy, comparing the population mortality in 2020 with previous years, to estimate mortality from COVID-19. We find that the number of COVID-19 deaths in Italy in 2020 until 9 September was 59,000–62,000, compared to the official number of 36,000. The proportion of the population that died was 0.29% in the most affected region, Lombardia, and 0.57% in the most affected province, Bergamo. Combining reported test positive rates from Italy with estimates of infection fatality rates from the Diamond Princess cruise ship, we estimate the infection rate as 29% (95% confidence interval 15–52%) in Lombardia, and 72% (95% confidence interval 36–100%) in Bergamo.


The COVID-19 pandemic is one of the most pressing challenges the world is facing today. Despite a large number of infected individuals and confirmed deaths, large uncertainties about the properties of the virus and the epidemic still remain. In this article we present an analysis of the mortality rate in Italy in 2020, which we find has been significantly higher than in previous years. Italy was one of the hardest-hit countries in the early stages of the pandemic, with >400,000 confirmed cases and >36,000 COVID-attributed deaths as of mid-October 2020 (ref. 1).

Several numbers in Italy present statistical peculiarities, such as the case fatality rate (CFR, defined as the ratio between the number of deaths attributed to COVID-19 and the number of positive tests), which is still at 10% in October 2020, was much higher in the earlier stages of the pandemic (ref. 2), and has led to early estimates of high mortality (ref. 3). The CFR is heavily affected by issues unrelated to the underlying disease, such as the extent of testing. A better metric is the infection fatality rate (IFR, the ratio between the number of deaths and the total number of infections), knowledge of which is essential to guide the public health response. The IFR, together with the population fatality rate (PFR, defined as the ratio between the number of deaths and the total population), allows us to estimate the infection rate (IR, the fraction of the population that is infected), which measures how widespread the disease is in the population and informs government response.

Estimating the IFR and IR is challenging, owing both to limited testing (hence, a poorly known number of infections) and to the uncertainty in the number of fatalities attributed to COVID-19. Official data account for those that have been tested. However, there may have been other deaths that were not tested and went unrecorded, which would imply that the official COVID-19 numbers underestimate the fatality rate.

Given the uncertainties in the official COVID-19 fatality rate, it is important to explore other routes for obtaining it. In this article, we propose a counterfactual analysis: we use historical mortality rate data from Italy to build models for the expected mortality rate in 2020 in the absence of the COVID-19 pandemic. We attribute the difference between the observed mortality in 2020 and the predicted counterfactual to the COVID-19 pandemic. The models we employ account for the historical year-to-year variability owing to seasonal effects such as the flu. We use two different models, each based on different assumptions about the underlying data distribution. One model is based on a conditional Gaussian process (hereafter, CGP) and the other on a synthetic controls method (hereafter, SCM). Consistent results from these two approaches suggest that the results do not depend on the modeling assumptions.
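
The counterfactual idea can be illustrated with a minimal sketch. It is not the CGP or SCM model used in this work: it simply fits a multivariate Gaussian to past years and conditions it on the pre-pandemic weeks of the current year. The function name, the ridge term, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def gaussian_counterfactual(history, observed_pre, ridge=1e-3):
    """Predict the remaining weeks of a year from its first weeks.

    history      : array (n_years, n_weeks) of past weekly mortality
    observed_pre : array (k,) with this year's first k pre-pandemic weeks
    ridge        : small diagonal term (an assumption of this sketch) that
                   keeps the covariance invertible when n_years < n_weeks
    """
    history = np.asarray(history, dtype=float)
    observed_pre = np.asarray(observed_pre, dtype=float)
    k = observed_pre.size
    n_weeks = history.shape[1]

    mean = history.mean(axis=0)
    cov = np.cov(history, rowvar=False) + ridge * np.eye(n_weeks)

    cov_pre = cov[:k, :k]        # pre-pandemic weeks with themselves
    cov_post_pre = cov[k:, :k]   # later weeks with pre-pandemic weeks
    cov_post = cov[k:, k:]       # later weeks with themselves

    # Standard Gaussian conditioning: gain = cov_post_pre @ inv(cov_pre)
    gain = np.linalg.solve(cov_pre, cov_post_pre.T).T
    cond_mean = mean[k:] + gain @ (observed_pre - mean[:k])
    cond_cov = cov_post - gain @ cov_post_pre.T
    cond_std = np.sqrt(np.clip(np.diag(cond_cov), 0.0, None))
    return cond_mean, cond_std


# Toy data: five past years that differ only by a year-wide offset.
offsets = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
history = 100.0 + offsets[:, None] + np.zeros((5, 12))
pre_2020 = np.full(8, 92.0)            # a milder-than-average start of the year
expected, sigma = gaussian_counterfactual(history, pre_2020)
observed_post = np.array([95.0, 130.0, 150.0, 120.0])
excess = observed_post - expected      # attributed to the pandemic in this toy
```

Because the toy year starts below the historical mean, the conditioned prediction stays below the mean as well, which is the behaviour that makes such a counterfactual more informative than a plain historical average.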

Figure 1 shows the counterfactual predictions for all regions in Italy in 2020. We plot predictions from both SCM (yellow) and CGP (green). For comparison, we also show the historical 2015–2019 data and their average (gray), and the mortality in 2020 after accounting for the reported COVID-19 deaths (black), i.e., the total reported mortality minus the reported COVID-19 death count. We note that the SCM and CGP methods both trace the pre-pandemic data closely (the latter method is designed to match the pre-pandemic data exactly, as detailed in the Methods section), while the historical mean estimates are generally higher. The mortality in Italy was below average in the first 2 months of 2020, probably owing to a milder than usual flu season. This discrepancy shows why SCM and CGP provide better counterfactuals than the simple historical mean estimate: they take into account this year's pre-pandemic mortality and exploit the time correlations in the mortality rates, allowing more accurate and precise predictions, provided the different assumptions made by these methods hold true. Where our predicted excess mortality is lower than the reported COVID-19 fatalities, we use the latter for our estimate of COVID-19 deaths. This choice only makes a statistically significant difference for the region of Lazio and for age groups below 30 years of age in a few other regions, but otherwise does not affect our conclusions with any significance. Figure 1 shows a clear excess in mortality over the counterfactual predictions after the week ending on 22 February, when the first COVID-19-related deaths were reported in Italy. This excess is mainly seen in the northern regions, which are the hardest hit. In the remainder of this work, we focus on these regions and the province of Bergamo (see also ref. 4 for an earlier analysis).


We show the observed weekly mortality due to all causes for the period of 1 January to 27 June (black) in all 20 regions in Italy, and our prediction for the expected mortality in the absence of COVID-19 (conditional Gaussian process (CGP) with its 1σ and 2σ errors from the variance of the Gaussian model, i.e., 68% and 95% confidence intervals, respectively, in green, and synthetic controls method (SCM) in orange). The first reported COVID-19 death occurred in the week ending on 22 February (thin red vertical line). The historical data from 2015 to 2019 (blue) and the corresponding historical mean (gray) are shown for comparison and are not a good fit to the observed pre-pandemic data. The dashed black line shows the observed mortality after removing reported COVID-19 deaths.

In Fig. 2, we show the excess deaths over the expected counterfactual for every week of reported data. We focus on the few regions that were hardest hit by the pandemic and which lead to the most statistically significant conclusions. Figure 2a shows that the excess weekly mortality is significantly higher than the official COVID-19 deaths in all regions at the beginning of the pandemic. We only have access to reported COVID-19 deaths in Bergamo up to 1 May 2020 (shown as a dashed pink vertical line); beyond that, we extrapolate them in the same proportion as Lombardia, the region in which the province of Bergamo lies. Since May 2020, the estimated excess is less than the reported COVID-19 fatalities for some regions, though it is mostly still consistent within the 1σ (68%) confidence interval of our predictions (except in Piemonte and Toscana). This is to be expected: as we move away from the intervention (the start of the pandemic), which occurred in February, the counterfactual prediction becomes less accurate and is more heavily influenced by the global mean. For these weeks and regions, whenever our predicted excess is less than the reported COVID-19 deaths, we use the latter for estimating fatality rates and infection rates (IRs). Figure 2b shows the cumulative excess in mortality compared with the total reported COVID-19 deaths at the end of every week. All regions see a consistent rise in excess deaths until early May 2020. If we attribute these deaths to COVID-19 infections, this suggests that the worst affected regions, such as Lombardia and Emilia-Romagna, have likely underestimated the mortality by a factor of 1.5, whereas other regions like Piemonte and Toscana have underestimated mortality by a factor of 2. For most regions, the number of deaths has decreased significantly since May 2020.

Fig. 2: Excess mortality compared with reported COVID-19 deaths in regions of northern Italy and the province of Bergamo.


(a) Excess weekly deaths and (b) cumulative excess deaths over the predicted counterfactual, in comparison to the reported COVID-19 deaths (in pink), for the period since 23 February (available COVID-19 data). Estimates from the synthetic controls method (SCM, orange) and conditional Gaussian process (CGP, green) counterfactuals agree. We show 1σ and 2σ errors (68% and 95% confidence intervals) from the variance of the Gaussian model. We find that COVID-19 deaths are under-reported by multiple factors for every period and every region. We extrapolate the excess beyond 27 June, which is the last week with available total mortality data (dashed black line), with dashed lines. To do this, we make the conservative assumption that after 27 June the reported COVID-19 deaths are accurate and account for the excess mortality over predicted trends.

In Figs. 2 and 3, we have extrapolated our estimated excess from 27 June to 5 September. Because the number of estimated excess deaths is consistent with reported COVID-19 deaths for the last 8 weeks for all regions (except Piemonte and Toscana), we assume that the weekly excess mortality is the same as the reported COVID-19 deaths for extrapolation after 27 June. Based on this, we estimate that the number of COVID-19 deaths in Italy is between 59,000 and 62,000 as of 9 September 2020, more than a factor of 1.5 higher than the official number. In the remainder of the paper, we use this extrapolation to estimate the age-dependent and population fatality rates and IRs.
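
The two bookkeeping rules described so far (use the reported COVID-19 deaths whenever they exceed the estimated excess, and fall back on reported deaths for weeks without total mortality data) can be sketched as follows. The function name, the use of NaN to mark missing weeks, and the toy numbers are assumptions of this sketch, not part of the analysis itself.

```python
import numpy as np

def covid_deaths_estimate(excess_weekly, reported_weekly):
    """Combine estimated excess deaths with reported COVID-19 deaths.

    excess_weekly   : weekly excess over the counterfactual; NaN marks weeks
                      without total mortality data (assumed convention)
    reported_weekly : weekly officially reported COVID-19 deaths
    Returns the weekly estimate and its cumulative sum.
    """
    excess = np.asarray(excess_weekly, dtype=float)
    reported = np.asarray(reported_weekly, dtype=float)

    # Extrapolation: where no excess estimate exists, trust the reported deaths.
    filled = np.where(np.isnan(excess), reported, excess)
    # Floor: never let the estimate fall below the reported deaths.
    weekly = np.maximum(filled, reported)
    return weekly, np.cumsum(weekly)


# Toy numbers, purely illustrative.
toy_excess = [120.0, 300.0, 80.0, np.nan, np.nan]
toy_reported = [60.0, 200.0, 100.0, 40.0, 30.0]
weekly, cumulative = covid_deaths_estimate(toy_excess, toy_reported)
```

In the toy series the third week is raised to the reported value and the last two weeks are filled with reported deaths, so the cumulative total can only be at or above the official count.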

In Fig. 3, we show the excess mortality for different age groups in intervals of 10 years above the age of 40. We find some agreement between the estimated excess and the reported COVID-19 deaths below the age of 70, but observe a significant and increasing discrepancy for higher age groups. This seems to suggest that testing, and consequently probably also treatment, has been more complete for lower age groups.


Attributing excess deaths to COVID-19

To back the assumption that excess deaths are a consequence of the pandemic, we establish a correlation between the daily excess deaths over the counterfactual and the official COVID-19 deaths by means of regression analysis: we perform a two-parameter fit to the excess deaths by allowing the official deaths to be scaled and shifted. We infer the time lag and amplitude of this fit by minimizing χ². We find that the best fits are obtained for time lags of −6 days for Lombardia, −7 days for Emilia-Romagna, −8 days for Piemonte, and −6 days for Marche. The inferred amplitudes range between 1.2 and 1.6. We provide figures and details for this analysis in the supplemental material. Given that both data sets report the day of death, not the day of report, the inferred time lags suggest that the official COVID-19 mortality lags behind the total mortality. One possible reason for this could be that hospital treatment postpones death on average by several days. A ramping up of testing with time could also cause this behavior.
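
A scale-and-shift fit of this kind can be sketched as a grid search over integer lags with a closed-form amplitude at each lag. This is a minimal illustration under stated assumptions rather than the fit used in the analysis: the function name, the lag range, the per-point normalization of χ², and the sign convention (model[t] = amplitude × official[t − lag], so a negative lag means the official series trails the excess) are all choices made here.

```python
import numpy as np

def fit_scale_and_shift(excess, official, sigma, max_lag=14):
    """Fit excess[t] ~ amplitude * official[t - lag] by minimizing chi^2.

    Returns (lag, amplitude, chi2_per_point) for the best integer lag.
    """
    excess = np.asarray(excess, dtype=float)
    official = np.asarray(official, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = excess.size
    best = None
    for lag in range(-max_lag, max_lag + 1):
        # Indices t for which both excess[t] and official[t - lag] exist.
        t = np.arange(max(0, lag), min(n, n + lag))
        shifted = official[t - lag]
        weights = 1.0 / sigma[t] ** 2
        denom = np.sum(weights * shifted ** 2)
        if denom == 0.0:
            continue
        # Weighted least squares gives the amplitude in closed form.
        amplitude = np.sum(weights * excess[t] * shifted) / denom
        chi2 = np.sum(weights * (excess[t] - amplitude * shifted) ** 2) / t.size
        if best is None or chi2 < best[2]:
            best = (lag, amplitude, chi2)
    return best


# Toy curves: the official series peaks 6 days after the excess series
# and is a factor 1.4 smaller.
days = np.arange(60)
toy_official = 100.0 * np.exp(-0.5 * ((days - 30) / 6.0) ** 2)
toy_excess = 140.0 * np.exp(-0.5 * ((days + 6 - 30) / 6.0) ** 2)
best_lag, best_amplitude, best_chi2 = fit_scale_and_shift(
    toy_excess, toy_official, sigma=np.ones(60)
)
```

On these toy curves the search recovers a lag of −6 days and an amplitude of 1.4, the values built into the synthetic data.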

However, correlation is not causation, and attributing the excess death rate to COVID-19 is still a strong assumption. Hence, we discuss possible caveats. COVID-19 has put enormous pressure on Italy's medical system and social services. This may have caused fatalities that could otherwise have been averted, leading us to overestimate the COVID-19 deaths. However, the pressure on the medical system is local and most likely sustainable for regions with a low number of official COVID-19 deaths, like Piemonte and Liguria. Instead, we consistently find a similar and very large excess in mortality over the official counts in many regions in Italy, which appears to be independent of how hard the region was hit (see Fig. 2b).

The temporal trend lends a similar argument: the societal and medical systems should function normally in the earliest stage of the pandemic and get increasingly stressed as the number of infections increases. We see that the fraction of deaths missed by the reported COVID-19 fatalities is highest in the early stage and decreases as the number of reported infections increases. The reported COVID-19 fatalities finally catch up with the estimated excess fatalities by the end of April 2020. We show this in Fig. 4, where we compare the fraction of deaths missed every week with the number of reported COVID-19 hospitalizations (normalized by the maximum number of hospitalizations up to 18 April 2020).


For the period of the pandemic, we show per week the fraction of missed deaths (green) with the corresponding 1σ (68% CI) uncertainty estimated from the variance of the Gaussian model, the number of hospitalizations (normalized by the maximum weekly hospitalizations up to 18 April 2020, in orange), and the number of tests conducted (normalized as a fraction of the tests conducted in the week of 11–18 April 2020, in blue). We find that the missed fraction goes down as the number of tests increases, while the hospitalizations have remained consistently high in the last 4 weeks.

Our hypothesis is that the excess deaths over official COVID-19 deaths are primarily due to the lack of testing in the initial stage of the pandemic. In Fig. 4, we also show the number of tests conducted every week as a fraction of the tests conducted in the week of 11 April 2020. The trend supports our assertion that, with an increase in testing as the pandemic evolves, the reported fatalities due to COVID-19 slowly catch up with the true mortality, and that the increased pressure on medical systems did not have a statistically significant effect on the mortality.

There are also arguments that suggest we may have underestimated the COVID-19 fatality rate. Italy has been under lockdown since 9 March 2020, which may have reduced fatalities due to other common causes such as road and workplace accidents, or criminal activities. This can be studied by observing how the fatality rate correlates with the lockdown dates in regions with little or no infection, such as southern Italy. There are several regions that do not show an excess death rate, but none of them shows a deficit death rate after 9 March 2020, so we assume that this effect is negligible, especially for age groups above working age.

Fatality rates and IRs

Having established that the observed excess deaths can reasonably be attributed to COVID-19, we use our estimates and uncertainties of the excess mortality from the CGP and SCM counterfactuals to calculate the fatality rates and infection fractions for Italian regions. The left panel of Fig. 5 shows the PFR in different age groups, i.e., the total number of excess deaths attributable to COVID-19 as a fraction of the population. We find a steep age dependence of the PFR: in Bergamo province, 1.89%, 4.84%, and 11.06% of the entire population in the age groups 70–79, 80–89, and 90+, respectively, died. For the whole population, the PFR is 0.57% in Bergamo (and 0.29% in Lombardia). Because the PFR coincides with the IFR if the infection fraction is 1 (the maximum possible), we expect these numbers to be the most conservative lower limits on the (age-dependent) IFR (Table 2).


(Left) Population fatality rate (PFR) from the cumulative estimates divided by the regional population. (Center) Lower bounds on the infection fatality rate (IFR) using the maximum test positivity rate (TPR) as an upper bound on the infection fraction. (Right) Estimates of the true IFR when normalizing the age 70–89 group to the Diamond Princess IFR (in shaded blue, with the corresponding Poisson error estimate). We also show estimates from Verity et al. (ref. 5) with the corresponding 68% CI in magenta, which give a less steep age dependence. In the center and right panels, the gray lines are weighted mean estimates for the IFR with 1σ weighted standard deviation bands. The horizontal lines are the age-averaged IFR for the entire population. Error bars for all regions and age groups are 1σ (68% CI) errors from the variance of the Gaussian model combined with Poisson errors based on the number of deaths, which differs for every region and age group. In all panels, we have staggered the points horizontally for every age group for better visibility.

Lower limits on IFR

The central panel of Fig. 5 shows the lower bounds on the IFR. Estimating the IFR from the PFR requires the IR of the population. Here, we use the test positivity rate (TPR), the fraction of positive to total tests, as an estimate of the fraction of the infected population. Owing to a lack of testing and the practice of mainly testing people with symptoms, this should be an upper bound on the IR in the early stages of the pandemic. For every region, we use the maximum of the cumulative TPR estimated up to 5 September as our estimate for the IR. This should be an upper limit on the IR and hence give a conservative lower bound on the IFR. We further assume that this ratio is age-independent in every region (ref. 5). The age-averaged lower bounds on the IFR are shown in Table 1, with the most robust estimate being a 0.73 ± 0.08% IFR lower bound from Lombardia, consistent with the 0.57% lower bound from Bergamo province.
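
The lower-bound argument amounts to dividing the PFR by an upper bound on the IR. A minimal sketch follows; the function name is hypothetical, and the TPR value of 0.5 in the second call is an illustrative placeholder rather than a measured number.

```python
def ifr_lower_bound(pfr, ir_upper):
    """Lower bound on the IFR given the PFR and an upper bound on the IR.

    Both arguments are fractions (0.0057 means 0.57%). Since IFR = PFR / IR,
    overestimating the IR can only underestimate the IFR.
    """
    if not 0.0 < ir_upper <= 1.0:
        raise ValueError("ir_upper must be a fraction in (0, 1]")
    return pfr / ir_upper


# Most conservative case: everyone infected (IR = 1), so the bound is the PFR.
bergamo_bound = ifr_lower_bound(0.0057, 1.0)

# Hypothetical maximum TPR of 0.5 (placeholder value) doubles the bound.
illustrative_bound = ifr_lower_bound(0.0057, 0.5)
```

The tighter the upper bound on the IR, the higher and more informative the resulting lower bound on the IFR.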

IR and IFR calibrated on the Diamond Princess

The PFR can also be combined with an independent estimate of the IFR to obtain the IR through the relation IR = PFR/IFR. At the time of writing, the only large dataset with complete testing, and hence an unbiased estimate of the IFR, is the Diamond Princess (DP) cruise ship. For our analysis, we assume that the age-dependent IFR is location independent: we account for age differences, but not for other differences between the DP and Italian populations in the same age group, such as co-morbidities, dose differences, or health-care access.

The last fatality on the ship was reported on 18 April 2020, and 11 out of 330 DP infections in the age group above 70 had been fatal (a few of the fatalities do not have age information). This results in an IFR for this age group of 3.3%, and we assume a Poisson distribution to estimate the errors. The population distribution in this age group on the DP was 80% in 70–79 and 20% above 80 (ref. 6). For each region of Italy, we re-weight the population to match this age distribution and hence match the age-weighted IFR to the DP in the 70–89 age group. Combining this with the corresponding PFR, we are able to estimate IRs for this age group. Then, under the assumption of an age-independent IR, we can combine this estimated IR with the observed PFR for other age groups to obtain the IFR for all other age groups (Table 1). The IR ranges from 4% up to 29% (15–52% 95% CI) in Lombardia and 72% (36–100% 95% CI) in the province of Bergamo. In all cases, the estimated mean IR is below the upper limit set by the maximum TPR.
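
The arithmetic of the DP calibration can be sketched as a point estimate. This is a simplification under stated assumptions: the re-weighting is approximated by a weighted average of the two age-group PFRs, uncertainties other than the Poisson error on the DP deaths are ignored, and the function names are hypothetical.

```python
import math

# Diamond Princess, age group above 70: 11 deaths out of 330 infections.
dp_deaths, dp_infections = 11, 330
ifr_dp = dp_deaths / dp_infections                  # about 3.3%
ifr_dp_err = math.sqrt(dp_deaths) / dp_infections   # Poisson error, about 1.0%

def reweighted_pfr(pfr_70_79, pfr_80_89, w_70_79=0.8, w_80_89=0.2):
    """PFR of the 70-89 group weighted to the DP age mix (80% / 20%).

    Treating the re-weighting as a weighted average of the two PFRs is an
    assumption of this sketch.
    """
    return w_70_79 * pfr_70_79 + w_80_89 * pfr_80_89

def infection_rate(pfr, ifr):
    """IR = PFR / IFR, capped at 1 because at most everyone is infected."""
    return min(pfr / ifr, 1.0)


# Bergamo province PFRs for ages 70-79 and 80-89 (1.89% and 4.84%).
pfr_bergamo = reweighted_pfr(0.0189, 0.0484)
ir_bergamo = infection_rate(pfr_bergamo, ifr_dp)    # roughly 0.74
```

This crude point estimate lands close to, but not exactly on, the 72% quoted for Bergamo, which comes from the full analysis with its uncertainties.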

Age dependence of IFR

The right panel of Fig. 5 shows our DP-anchored IFR estimates. As we make the assumption of a constant IR for all age groups, we focus on the regions with high IR (>10%), where this assumption is more likely to hold. The most reliable data come from Lombardia and Bergamo, as they are nearly complete, past the peak, and have high statistics with small errors. The age-dependent IFR ranges from below 0.04% for ages below 50 years to 2.5%, 7.24%, and 20% for ages 70–79, 80–89, and above 90 years, respectively (Table 2). This is broadly consistent with the estimates from Hubei province in China, but suggests a steeper age dependence, as shown in Fig. 5 for the analysis of Verity et al. (refs. 3, 5, 6, 7, 8). Although the overall amplitude of our IFR estimates is anchored to the DP, the relative age dependence is not.

Crude mortality rate per year traces IFR

In Table 2, we list the crude mortality rate per year (YMR), i.e., the fraction of the population that on average dies within a year, for each age group and region. We make the interesting observation that the YMR traces the IFR for ages above 60 to within 20% for different regions in Italy. One possible explanation for this trend could be that the YMR takes into account varying prevalences of co-morbidities across different populations and ages. Many of the co-morbidities that reduce the general life expectancy might also lead to an increased risk of dying from COVID-19. We find that this observation holds in other locations as well, for example New York City, where the population YMR is 0.62%, and the United Kingdom (ref. 9). This number is similar to the NYC IFR estimated from combining the age-dependent IFR from Italian data, and consistent with a lower bound of 0.49% IFR estimated independently of Italian data by combining the NYC COVID-19 PFR of 0.286% as of October 2020 with the maximum TPR of 0.58 (58%), which was reached in April 2020. Similarly, as of 22 April 2020, with 9900 confirmed deaths, this IFR of 0.62% predicts a 19.3% IR. This is in good agreement with the then estimated IR of 23.2% for the NYC population and 16% IR for the 65+ years age group from seropositivity tests (ref. 10). Another interesting observation is that the age-dependent YMR also matches the ratio of deaths in different age groups. The ratio of YMR predicts the share of COVID-19 deaths in age groups 45–64, 65–74, and above 75 to be 19%, 18%, and 55%, respectively (ref. 11). This is in agreement with the current official NYC COVID-19 death fractions of 22%, 24%, and 50% (ref. 11). These numbers match the higher death fraction among the younger population that is observed in the US compared with some Italian regions like Lombardia, where only 8% of fatalities are in the age group