Project completed on August 24, 2023.
Kazhydromet serves as the national hydrometeorological agency of the Republic of Kazakhstan, offering comprehensive hydrological data sourced from 227 stations distributed across the country.
The Kazhydromet-Web-Scraping project automates the retrieval of data from Kazhydromet's Meteorological Database for the period 01/01/2000 to 30/04/2023. Retrieving this data manually would be extremely time-consuming: the database holds about 3.5 GB of tabular data spread across a very large number of tables. The process is automated with Python and the Selenium framework.
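As a rough illustration of the Python and Selenium setup, here is a minimal sketch that opens the Kazhydromet site in a headless Chrome session. It is not the actual contents of `download_data.py`: the function names `open_site` and `format_date`, the choice of Chrome, and the headless flag are all assumptions made for this example, and the DD/MM/YYYY format simply mirrors how the dates are written in this README.

```python
from datetime import date

# Scraping window as stated in this README.
START = date(2000, 1, 1)
END = date(2023, 4, 30)

# Site entry point (Russian-language version).
BASE_URL = "https://www.kazhydromet.kz/ru/"


def format_date(d):
    """Format a date as DD/MM/YYYY, the notation used in this README.

    Whether the site's date fields expect this exact format is an assumption.
    """
    return d.strftime("%d/%m/%Y")


def open_site(url=BASE_URL, headless=True):
    """Start Chrome through Selenium and load the given page.

    Hypothetical helper: the real script may configure the browser differently.
    Selenium is imported here so the date helper works without it installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    if headless:
        # Run without a visible browser window.
        options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    driver.get(url)
    return driver
```

A caller would keep the returned driver, navigate to the database pages from there, and call `driver.quit()` when finished so the browser process does not linger.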
About the website:
- The website URL is: https://www.kazhydromet.kz/ru/. By default, the language is set to Russian, but it can be switched to English.
- Following the guide below, we can access the target database for scraping:
About the database:
- The database includes meteorological indices such as Temperature, Partial Pressure, Relative Humidity, and 8 other indices.
- Data is categorized by regions (17 in total across Kazakhstan) and respective stations within each region (totaling 227 stations).
- Our goal is to scrape data from 01/01/2000 to 30/04/2023.
- Each table contains approximately 8,521 entries, which corresponds to one entry per day over the target period, and is around 800 KB in size. Below is a screenshot of the database with comments.
- To launch the script:
python .\download_data.py
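The figures quoted above can be sanity-checked with a few lines of Python. The sketch below counts the days in the scraping window and, assuming the approximate sizes stated in this README (3.5 GB in total, about 800 KB per table, decimal units), estimates how many tables that implies. The table estimate is only as good as those rounded figures.

```python
from datetime import date

start = date(2000, 1, 1)
end = date(2023, 4, 30)

# Number of calendar days in the window, counting both endpoints.
days_in_window = (end - start).days + 1
print(days_in_window)  # 8521, matching the per-table entry count

# Rough table count implied by the stated sizes (decimal units assumed).
total_bytes = 3.5 * 1000**3      # about 3.5 GB overall
bytes_per_table = 800 * 1000     # about 800 KB per table
estimated_tables = total_bytes / bytes_per_table
print(round(estimated_tables))   # roughly 4,400 tables
```

The day count agrees with the 8,521 entries per table, which suggests each table holds one record per day for the full period.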
kazgydromet_data/темп/возд -- a sample of the data that was successfully scraped from the Kazhydromet Database.
station_geocode/nominatim.ipynb -- provides geographic coordinates for each hydrological station run by Kazhydromet.
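
For readers who want to see how station coordinates could be looked up, here is a minimal sketch that queries the public Nominatim search endpoint using only the standard library. It is not taken from `nominatim.ipynb`, whose contents are not shown here: the helper names `build_query_url`, `parse_coordinates`, and `geocode`, the `user_agent` string, and the idea of searching by station name plus country are all assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_query_url(station_name, country="Kazakhstan"):
    """Build a Nominatim search URL for one station (hypothetical helper)."""
    params = {"q": f"{station_name}, {country}", "format": "json", "limit": 1}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def parse_coordinates(payload):
    """Return (lat, lon) as floats from a Nominatim JSON reply, or None if empty."""
    results = json.loads(payload)
    if not results:
        return None
    # Nominatim encodes latitude and longitude as strings.
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode(station_name, user_agent="kazhydromet-geocode-example"):
    """Look up one station over the network.

    The public Nominatim service asks clients to send an identifying
    User-Agent and to keep the request rate low (about one per second).
    """
    request = Request(build_query_url(station_name),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return parse_coordinates(response.read().decode("utf-8"))
```

Splitting URL construction and response parsing from the network call keeps the lookup easy to test offline, and a loop over all 227 stations would need a pause between requests to respect the service's rate limit.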