alvmarrod/COVID-19-PRED

COVID-19 Prediction

This repo aims to analyze COVID-19 spreading patterns based on worldwide data.

Raw - Data Sources

  1. As base data for COVID-19, our analysis uses the Johns Hopkins repository. This data is stored in the Covid Folder, where we replicate the content of the time series reports. Don't panic: the data itself is downloaded by the script main.py whenever it runs. We just maintain a copy here in our repo.

  2. As population density data, we fetch the latest data available from World Population Review. Since this data is annual, we keep it stored in the repo, unchanged, in the Raw Population Density Folder. The script uses this copy directly and does not fetch it from elsewhere.

  3. As masks usage data, we have created a hand-made list using public domain knowledge. You can find more information and links both in our Medium story and in our Raw Masks Usage Folder.

  4. As population risk data, we have collected the information we could find regarding the presence of ACE2 cells in the human body depending on the population. You can find the raw information for Weighted Risk in the CSV in the Raw Population Risk Folder.

  5. As government countermeasures data, we have collected information from different sources. Since this is the most subjective part, no data transformation has been applied: the collected values are used directly as the features described in the next section. Please find it in this separate readme, due to the large number of countries.
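
To illustrate item 1, here is a minimal sketch of how the Johns Hopkins time series could be downloaded and reshaped from one column per date into one row per date. It is not the code in main.py; the URL constant and the helper names `download_csv` and `wide_to_long` are assumptions made for this example, as is the column layout (`Province/State`, `Country/Region`, `Lat`, `Long`, followed by one column per day).

```python
import csv
import io
from datetime import datetime
from urllib.request import urlopen

# Assumed location of the confirmed-cases global time series in the
# Johns Hopkins CSSE repository; main.py may use a different file or URL.
JHU_CONFIRMED_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
    "time_series_covid19_confirmed_global.csv"
)

# Columns that describe the location rather than a daily count (assumed layout).
FIXED_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def download_csv(url=JHU_CONFIRMED_URL):
    """Fetch a CSV file and return its content as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def wide_to_long(csv_text):
    """Turn one-column-per-date rows into (country, province, date, count) tuples."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        for column, value in record.items():
            if column in FIXED_COLUMNS:
                continue
            # Date headers look like 1/22/20 (month/day/two-digit year).
            day = datetime.strptime(column, "%m/%d/%y").date()
            rows.append(
                (record["Country/Region"], record["Province/State"], day, int(float(value)))
            )
    return rows


# Small inline sample, so the reshaping can be tried without network access.
SAMPLE = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    ",Spain,40.0,-4.0,0,1\n"
)
sample_rows = wide_to_long(SAMPLE)
```

With network access, `wide_to_long(download_csv())` would apply the same reshaping to the live file. The long format is convenient for joining with the per-country data described in the other items.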

Features - Model Data

To know how raw data has been processed into features, please refer to the Medium story.

  1. COVID-19 data has been processed and saved from Raw Covid Folder to Feature Covid Folder.

  2. The population density data has been processed and saved from Raw Population Density Folder to Feature Population Density Folder.

  3. The masks usage has been processed and saved from Raw Masks Usage Folder to Feature Masks Usage Folder.

  4. The weighted population risk that we have calculated has been processed as well and saved from Raw Population Risk Folder into the Features Population Risk Folder.

  5. The government measures that we have collected, as explained in the Medium story, have been directly copied into the Features Governments Measures Folder.

Dependencies

Please use the available requirements.txt file to install all the dependencies (except PyTorch, which you should install yourself; please see the link below) using:

python -m pip install -r requirements.txt

Thanks to

As you may find out going through the project, I would like to thank:

  1. Pablo Gomez, whose effort in collecting needed data made it possible for the project to happen. Specifically:
  2. Ricardo Villalobos, who helped us with:
  3. Javier VGD, who helped us with:
  • Reviewing the Medium story
  • Some code challenges or problems.

High five, guys!