A comprehensive dashboard for analyzing aviation crash data with AI-powered risk prediction and interactive visualizations.
- Data Collection & Integration
  - Multi-source data collection (NTSB, FAA, ASN)
  - Automated data updates
  - MongoDB integration for efficient storage
  - Data standardization and cleaning
- Interactive Dashboard
  - Real-time crash data visualization
  - Advanced filtering options
  - Trend analysis
  - Aircraft type and operator statistics
  - Export functionality for filtered data
- AI-Powered Analysis
  - Risk prediction for incidents
  - Feature importance analysis
  - Confidence scoring
  - Natural language processing for crash descriptions
- Performance Optimizations
  - Multi-level caching system
  - Database indexing
  - Query optimization
  - Background processing for long-running tasks
- Clone the repository into a pycrashai directory and enter it:

      git clone https://github.com/Mizokuiam/Aviation-Crash-AIAnalysis.git pycrashai
      cd pycrashai
- Create and activate a virtual environment:

      python -m venv venv
      venv\Scripts\activate       # Windows
      source venv/bin/activate    # Linux/Mac
- Install dependencies:

      pip install -r requirements.txt
- Set up MongoDB:
  - Install MongoDB Community Edition
  - Start the MongoDB service
  - Default connection: mongodb://localhost:27017/
Project structure:

    pycrashai/
    ├── data/               # Data storage
    │   └── raw/            # Raw collected data
    ├── models/             # Trained ML models
    ├── src/
    │   ├── analysis/       # Data analysis modules
    │   ├── collectors/     # Data collection modules
    │   ├── dashboard/      # Dashboard application
    │   ├── database/       # Database operations
    │   └── models/         # AI model definitions
    ├── tests/              # Unit tests
    └── requirements.txt    # Project dependencies
- Start the dashboard:

      python src/dashboard/app.py

- Access the dashboard at http://127.0.0.1:8050/
- Available features:
  - View crash statistics and trends
  - Filter data by date, severity, aircraft type, etc.
  - Export filtered data to Excel
  - Analyze risk factors using AI predictions
  - View geographical distribution of incidents
- Environment variables can be set in .env:

      MONGODB_URI=mongodb://localhost:27017/
      DATA_DIR=data/raw
      MODEL_DIR=models
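
As an illustration of how these three variables might be consumed, here is a minimal sketch. It assumes the `.env` file has already been loaded into the process environment (for example by a package such as python-dotenv); `Settings` and `load_settings` are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    mongodb_uri: str
    data_dir: str
    model_dir: str


def load_settings(env=None) -> Settings:
    """Read the variables, falling back to the values shown in this README."""
    env = os.environ if env is None else env
    return Settings(
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017/"),
        data_dir=env.get("DATA_DIR", "data/raw"),
        model_dir=env.get("MODEL_DIR", "models"),
    )
```

Passing a plain dict as `env` makes the function easy to exercise in tests without touching the real environment.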
- Caching System:
  - Source data cache (5 min TTL)
  - Query results cache (1 min TTL)
  - Statistics cache (5 min TTL)
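
The TTL behaviour listed above can be sketched with a small in-memory cache. This is an illustrative sketch only, not the project's actual caching code: the `TTLCache` class and the three cache variable names are hypothetical, and only the TTL values come from the list.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, or call compute() if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


# TTLs taken from the list above; the variable names are hypothetical.
source_cache = TTLCache(ttl_seconds=300)   # source data, 5 min
query_cache = TTLCache(ttl_seconds=60)     # query results, 1 min
stats_cache = TTLCache(ttl_seconds=300)    # statistics, 5 min
```

A shorter TTL on query results trades more recomputation for fresher filtered views, while source data and statistics change less often and can be held longer.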
- Database Optimization:
  - Indexed fields: date, source, severity, aircraft_type, operator
  - Text search on crash descriptions
  - Aggregation pipelines for statistics
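
A rough sketch of what the indexes and an aggregation pipeline could look like follows. It assumes a pymongo-style collection object with a `create_index` method; the text-indexed field name `description`, the helper names, and the example pipeline are all assumptions made for illustration rather than the project's schema.

```python
# Index specs in the (field, direction) form that pymongo's create_index
# accepts: 1 means ascending and "text" requests a text index.
INDEX_SPECS = [
    [("date", 1)],
    [("source", 1)],
    [("severity", 1)],
    [("aircraft_type", 1)],
    [("operator", 1)],
    [("description", "text")],  # assumed field name for crash descriptions
]


def ensure_indexes(collection):
    """Create each index on the given collection and return the index names."""
    return [collection.create_index(spec) for spec in INDEX_SPECS]


def severity_counts_pipeline(start, end):
    """Hypothetical pipeline: incidents per severity within [start, end)."""
    return [
        # $match comes first so the date index can narrow the scan.
        {"$match": {"date": {"$gte": start, "$lt": end}}},
        {"$group": {"_id": "$severity", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
    ]
```

With a real connection, the pipeline would be passed to the collection's `aggregate` method.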
- UI Optimizations:
  - Lazy loading of components
  - Pagination for large datasets
  - Background processing for computations
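
To make the pagination item concrete, here is a minimal sketch of page slicing over an in-memory list of rows. The `paginate` function and its default page size are hypothetical; the dashboard's own implementation may differ.

```python
from math import ceil


def paginate(rows, page, page_size=25):
    """Return one page of rows plus the total page count (pages start at 1)."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    total_pages = max(1, ceil(len(rows) / page_size))
    page = min(max(page, 1), total_pages)  # clamp out-of-range requests
    start = (page - 1) * page_size
    return rows[start:start + page_size], total_pages
```

When the rows live in MongoDB rather than in memory, the same `start` and `page_size` arithmetic would typically be applied through `skip` and `limit` on the query, so only one page is fetched at a time.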
- Run tests:

      pytest tests/
- Code style:

      flake8 src/
      black src/
- Core:
  - Python 3.11+
  - MongoDB 4.4+
  - Dash 2.14.1
  - Pandas 2.1.4
  - Scikit-learn 1.3.2
- AI/ML:
  - TensorFlow 2.18.0
  - SHAP 0.44.0
  - spaCy 3.8.3
- Data Processing:
  - NumPy 1.26.2
  - Plotly 5.18.0
  - XlsxWriter 3.1.9
Contributing:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:
- NTSB for providing aviation accident data
- FAA for regulatory information
- Aviation Safety Network for additional data sources