Country Analysis - Italy

The main goal of this analysis is to provide an up-to-date visual overview of the Covid-19 outbreak in Italy, including:

  • Time trend overview at the national level.
  • Overall numbers in a nutshell.
  • Analysis at the regional level, including insights on the four regions with the most active cases.
  • Time trend overview for Lombardy, the main hotbed of the Italian outbreak.

The data is updated on a daily basis.

Data cleaning and formatting

We define a function that fetches the raw data from the corresponding URL and returns it as a dataframe.
This function allows us to get the latest updates.
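
The original cell is not shown here, so the following is only a minimal sketch of such a loader, assuming the raw files are plain CSV and that pandas is used. The function name get_data and the Italian column names in the sample are assumptions. The demonstration reads an in-memory CSV instead of a real URL so that it runs offline.

```python
from io import StringIO

import pandas as pd


def get_data(source):
    """Read a CSV from a URL, a local path, or a file-like object into a DataFrame.

    Hypothetical helper: the notebook's own function may differ.
    """
    return pd.read_csv(source)


# Offline demonstration: one row in the assumed raw (Italian) column layout.
sample_csv = StringIO("data,stato,totale_casi\n2020-02-24T18:00:00,ITA,229\n")
sample = get_data(sample_csv)
print(sample.shape)  # (1, 3)
```

Passing a URL string instead of the StringIO object would download the file, since pandas.read_csv accepts both.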

Import the latest reports from the Presidenza del Consiglio dei Ministri - Dipartimento della Protezione Civile.
There are four datasets:

  1. da_national: National data (time series).
  2. da_region: Regional data (time series).
  3. da_daily_region: Latest daily data on regional level.
  4. da_province: Province data.


We gather data up to yesterday's date, because the files are updated daily at midnight and this ensures that the latest file is already available.
The date variable "ieri" (Italian for "yesterday") is used to build the URLs dynamically. We have to build the date string ourselves because the date format used in the Italian repository differs from the one used by the CSSE at Johns Hopkins University.
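
As an illustration of building the URL from "ieri", here is a sketch under stated assumptions: the base URL, the file-name pattern and the YYYYMMDD date stamp are assumptions about the Protezione Civile repository and are not given in the text. The helper names are hypothetical.

```python
from datetime import date, timedelta

# Assumed repository location and file pattern (not stated in the text).
BASE_URL = "https://raw.githubusercontent.com/pcm-dpc/COVID-19/master"


def yesterday_stamp(today=None):
    """Return yesterday's date as a YYYYMMDD string (assumed repository format)."""
    today = today or date.today()
    return (today - timedelta(days=1)).strftime("%Y%m%d")


def daily_region_url(stamp):
    """Build the URL of the daily regional file for a given date stamp."""
    return f"{BASE_URL}/dati-regioni/dpc-covid19-ita-regioni-{stamp}.csv"


# Fixed reference date so the example is reproducible.
ieri = yesterday_stamp(date(2021, 9, 15))
print(daily_region_url(ieri))
```

In the notebook itself the function would be called without an argument, so that "ieri" always tracks the current date.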

We rename the columns to translate them to English.
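
The renaming step can be sketched with a translation dictionary. The Italian source names below are assumptions about the raw files. The English names are the ones that appear in the tables of this analysis. Columns missing from the dictionary keep their Italian names, which matches the untranslated columns such as casi_testati in the output.

```python
import pandas as pd

# Assumed Italian source names mapped to the English names used in this analysis.
COLUMN_TRANSLATIONS = {
    "data": "date",
    "stato": "state",
    "ricoverati_con_sintomi": "hospitalized with symptoms",
    "terapia_intensiva": "intensive care",
    "totale_ospedalizzati": "total hospitalized",
    "isolamento_domiciliare": "self-isolation",
    "totale_positivi": "total currently positive",
    "nuovi_positivi": "new positive",
    "dimessi_guariti": "recovered",
    "deceduti": "fatalities",
    "totale_casi": "total positive",
    "tamponi": "total tested",
}


def translate_columns(df):
    """Return a copy of df with known Italian column names translated to English."""
    return df.rename(columns=COLUMN_TRANSLATIONS)


raw = pd.DataFrame(
    {
        "data": ["2020-02-24T18:00:00"],
        "stato": ["ITA"],
        "deceduti": [7],
        "casi_testati": [None],  # not in the dictionary, so it stays untranslated
    }
)
translated = translate_columns(raw)
print(list(translated.columns))
```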

timestamp | state | hospitalized with symptoms | intensive care | total hospitalized | self-isolation | total currently positive | variation of total positive | new positive | recovered | fatalities | casi_da_sospetto_diagnostico | casi_da_screening | total positive | total tested | casi_testati | note | ingressi_terapia_intensiva | note_test | note_casi | totale_positivi_test_molecolare | totale_positivi_test_antigenico_rapido | tamponi_test_molecolare | tamponi_test_antigenico_rapido
2020-02-24T18:00:00 | ITA | 101 | 26 | 127 | 94 | 221 | 0 | 221 | 1 | 7 | nan | nan | 229 | 4324 | nan | nan | nan | nan | nan | nan | nan | nan | nan
2020-02-25T18:00:00 | ITA | 114 | 35 | 150 | 162 | 311 | 90 | 93 | 1 | 10 | nan | nan | 322 | 8623 | nan | nan | nan | nan | nan | nan | nan | nan | nan

date | state | regional code | region | lat | long | hospitalized with symptoms | intensive care | total hospitalized | self-isolation | total currently positive | variation of total positive | new positive | recovered | fatalities | casi_da_sospetto_diagnostico | casi_da_screening | total positive | total tested | casi_testati | note | ingressi_terapia_intensiva | note_test | note_casi | totale_positivi_test_molecolare | totale_positivi_test_antigenico_rapido | tamponi_test_molecolare | tamponi_test_antigenico_rapido | codice_nuts_1 | codice_nuts_2
2021-09-14T17:00:00 | ITA | 2 | Valle d'Aosta | 45.737503 | 7.320149 | 2 | 0 | 2 | 78 | 80 | 5 | 10 | 11511 | 473 | nan | nan | 12064 | 175822 | 83284.000000 | nan | 0.000000 | nan | nan | 11261.000000 | 803.000000 | 107518.000000 | 68304.000000 | ITC | ITC2
2021-09-14T17:00:00 | ITA | 5 | Veneto | 45.434905 | 12.338452 | 216 | 56 | 272 | 12317 | 12589 | -295 | 427 | 438365 | 11728 | nan | nan | 462682 | 11179416 | 2054238.000000 | nan | 8.000000 | nan | nan | 448634.000000 | 14048.000000 | 6505074.000000 | 4674342.000000 | ITH | ITH3

date | state | regional code | region | lat | long | hospitalized with symptoms | intensive care | total hospitalized | self-isolation | total currently positive | variation of total positive | new positive | recovered | fatalities | casi_da_sospetto_diagnostico | casi_da_screening | total positive | total tested | casi_testati | note | ingressi_terapia_intensiva | note_test | note_casi | totale_positivi_test_molecolare | totale_positivi_test_antigenico_rapido | tamponi_test_molecolare | tamponi_test_antigenico_rapido | codice_nuts_1 | codice_nuts_2
2021-09-13T17:00:00 | ITA | 13 | Abruzzo | 42.351222 | 13.398438 | 75 | 7 | 82 | 2090 | 2172 | -65 | 26 | 75496 | 2535 | nan | nan | 80203 | 2123062 | 819951 | nan | 1 | nan | nan | 80203 | 0 | 1365008 | 758054 | ITF | ITF1
2021-09-13T17:00:00 | ITA | 17 | Basilicata | 40.639471 | 15.805148 | 49 | 4 | 53 | 1264 | 1317 | -25 | 6 | 27572 | 606 | nan | nan | 29495 | 449109 | 232458 | Il numero totale dei decessi ne comprende n. 23 a carico di pazienti non residenti, deceduti in strutture ospedaliere della Regione Basilicata. | 0 | nan | nan | 29495 | 0 | 427418 | 21691 | ITF | ITF5

Active cases keep growing at the national level; however, the growth in new cases has been slowing down since the beginning of April. If the curve of currently positive cases continues to follow the trends seen in China and Korea, growth can be expected to continue for at least one month after the start of the countrywide lockdown.
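
The day-over-day growth behind this kind of statement can be obtained by differencing the cumulative series. The sketch below uses only the two national rows shown in the data preview (24 and 25 February 2020) and assumes pandas. On these two rows the differences agree with the reported "variation of total positive" (90) and "new positive" (93) values, which suggests those columns are daily differences, although the source does not state this.

```python
import pandas as pd

# First two national rows from the data preview.
national = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2020-02-24T18:00:00", "2020-02-25T18:00:00"]),
        "total currently positive": [221, 311],
        "total positive": [229, 322],
    }
).set_index("timestamp")

# Day-over-day change of each series; the first row has no predecessor, so it is NaN.
daily_change = national.diff()
print(daily_change)
```

A slowdown in growth would show up as shrinking values in daily_change over consecutive days.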

Country numbers in a nutshell

Overall status as of 09/14/2021:
total tested | total positive | total currently positive | recovered | self-isolation | total hospitalized | intensive care | fatalities | death rate [%]
87892474     | 4613214        | 122340                   | 4360847   | 117621         | 4719               | 554            | 130027     | 2.818577
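
The death rate in the last column is consistent with fatalities divided by total positive cases, expressed in percent. The notebook's own formula is not shown, so the following is a short check under that assumption, using the figures from the table.

```python
# Figures taken from the overall status table.
fatalities = 130027
total_positive = 4613214

# Assumed definition: share of all confirmed cases that ended in death, in percent.
death_rate = fatalities / total_positive * 100
print(round(death_rate, 2))  # 2.82
```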

Relative to the total number tested, the figures in percent are as follows:

total positive | total currently positive | recovered | self-isolation | total hospitalized | intensive care | fatalities
5.250000       | 0.140000                 | 4.960000  | 0.130000       | 0.010000           | 0.000000       | 0.150000
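
These percentages can be reproduced by dividing each figure by the total tested and rounding to two decimals. The sketch below uses the numbers from the overall status table. The rounding step is an assumption inferred from the displayed values.

```python
# Figures taken from the overall status table.
total_tested = 87892474
totals = {
    "total positive": 4613214,
    "total currently positive": 122340,
    "recovered": 4360847,
    "self-isolation": 117621,
    "total hospitalized": 4719,
    "intensive care": 554,
    "fatalities": 130027,
}

# Share of the total tested, in percent, rounded to two decimals.
shares = {name: round(value / total_tested * 100, 2) for name, value in totals.items()}
for name, share in shares.items():
    print(f"{name}: {share}")
```

The two-decimal rounding explains why intensive care shows as 0.00: the unrounded share is well below 0.005 percent.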

Analysis on regional level

Summary data for each region is shown when you hover over that region's block.

Top four regions with most active cases

Lombardia is the region with the biggest share of the currently active cases.
Because the outbreak hit Lombardia first, the numbers of fatalities and recovered cases are higher there, which explains the lower number of currently active cases in comparison to the total number of cases. As of 30.03, the growth in active cases in the region is slowing down.
In the other regions the outbreak started more recently; therefore the currently positive and total positive counts in those regions are very close to each other, and the curve of active cases keeps growing.

region         | total tested | total positive | total currently positive | percentage of active [%] | percentage of total positive [%]
Sicilia        | 6006805      | 289329         | 26014                    | 21.260000                | 6.270000
Emilia-Romagna | 8352278      | 418257         | 14948                    | 12.220000                | 9.070000
Veneto         | 11132482     | 462255         | 12884                    | 10.530000                | 10.020000
Lazio          | 8749592      | 379291         | 12607                    | 10.300000                | 8.220000
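
A ranking like the one above can be sketched with pandas by selecting the four largest values of "total currently positive" and dividing by the national totals. The code below is an assumption about the method, not the notebook's own cell. It uses the four regions from the table plus the Abruzzo and Basilicata rows from the data preview, with the national totals from the overview as denominators.

```python
import pandas as pd

# Regional figures from the table above and from the data preview (Abruzzo, Basilicata).
regions = pd.DataFrame(
    {
        "region": ["Abruzzo", "Basilicata", "Emilia-Romagna", "Lazio", "Sicilia", "Veneto"],
        "total positive": [80203, 29495, 418257, 379291, 289329, 462255],
        "total currently positive": [2172, 1317, 14948, 12607, 26014, 12884],
    }
)

# National totals from the overall status table.
national_active = 122340
national_positive = 4613214

# Keep the four regions with the most active cases, largest first.
top4 = regions.nlargest(4, "total currently positive").copy()
top4["percentage of active [%]"] = (
    top4["total currently positive"] / national_active * 100
).round(2)
top4["percentage of total positive [%]"] = (
    top4["total positive"] / national_positive * 100
).round(2)
print(top4.to_string(index=False))
```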

Time trend in Lombardia


As of 31.03.20, the peak of active cases was registered on 29.03.20. However, the number of active cases started to grow again in the first week of April, so the peak has not been reached yet.
The trends in China and Korea give a point of comparison: in China, the curve stayed flat for about five days after the peak before starting to go down; in Korea, the curve oscillated for five days after the peak value before going down steadily.