Studying the use of generalized linear models applied to extreme values of air quality data in the Grande Vitória region, ES, Brazil
The idea of this project is to study the impact of air pollution on children's hospitalization for respiratory diseases.
This idea was already explored using wavelets in previously published work.
However, some research questions remain. One of them concerns overdispersion in the data, which created the opportunity to study modeling techniques for extreme values.
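As a rough illustration of what overdispersion means for count data such as daily admissions, the sketch below computes the variance-to-mean ratio, which is close to 1 for Poisson-like counts and clearly above 1 when the data are overdispersed. The function name `dispersion_index` and the counts are hypothetical and invented for this example; they are not taken from Datasus.

```python
import statistics


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 if overdispersed."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    # statistics.variance is the sample variance (n - 1 in the denominator).
    return statistics.variance(counts) / mean


# Hypothetical daily admission counts, invented for illustration (NOT Datasus data).
daily_admissions = [2, 0, 1, 3, 0, 12, 1, 0, 2, 9, 0, 1, 4, 0, 15, 1, 2, 0, 3, 0]

print(f"dispersion index: {dispersion_index(daily_admissions):.2f}")
```

A ratio well above 1 suggests that a plain Poisson model would understate the variability, which is one reason to look at more flexible model families.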
The initial plan is:
1 - Exploratory analysis
2 - Verify adherence of the variables to an extreme value distribution
3 - Compare VGAM and GAMLSS models, which are not necessarily linear models but have attracted attention in the literature.
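Step 2 of the plan could be sketched as follows, under several simplifying assumptions: the series is reduced to block maxima, a Gumbel distribution (the extreme value family with shape parameter zero) is fitted by the method of moments, and the fit is summarized by the Kolmogorov-Smirnov distance. This is only a standard-library illustration and not the project's method; a full analysis would more likely fit a three-parameter GEV by maximum likelihood. All function names are hypothetical, and the hourly series is synthetic, not IEMA data.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def block_maxima(values, block_size):
    """Split a series into consecutive blocks and keep each block's maximum."""
    n_blocks = len(values) // block_size  # an incomplete trailing block is dropped
    return [max(values[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


def fit_gumbel_moments(maxima):
    """Method-of-moments estimates (loc, scale) of a Gumbel distribution."""
    scale = statistics.stdev(maxima) * math.sqrt(6) / math.pi
    loc = statistics.mean(maxima) - EULER_GAMMA * scale
    return loc, scale


def gumbel_cdf(x, loc, scale):
    """Cumulative distribution function of the Gumbel distribution."""
    return math.exp(-math.exp(-(x - loc) / scale))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and a candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))


# Synthetic hourly concentrations for one year (NOT IEMA data).
random.seed(0)
hourly = [random.expovariate(1 / 20.0) for _ in range(24 * 365)]

daily_max = block_maxima(hourly, block_size=24)
loc, scale = fit_gumbel_moments(daily_max)
distance = ks_statistic(daily_max, lambda x: gumbel_cdf(x, loc, scale))

print(f"loc={loc:.1f} scale={scale:.1f} KS distance={distance:.3f}")
```

A small KS distance indicates that the block maxima are close to the fitted distribution. Because the parameters are estimated from the same sample, standard KS critical values are only a rough guide here.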
Data sources:
- Datasus (the code included in the article downloads the database): a Brazilian government database with information on the public health system. This project uses hospital admissions of children in the Vitória region, state of Espírito Santo, Brazil.
- IEMA (https://iema.es.gov.br/qualidadedoar/dadosdemonitoramento/automatica): a database of pollutants from the study region, collected by a government agency.
Future and parallel projects that will support the implementation:
- Automate the ETL of data from the data sources.
- Automate the treatment of missing data in the databases.
- Automate the application of models for prediction (Machine Learning).