Project completed on May 16, 2024.
Using the Reaction Time Survey dataset, this project conducts rigorous regression modeling and analysis to estimate student reaction times. The project outline is as follows (more details in project_report.pdf):
- Abstract
- Exploratory Data Analysis (EDA)
  - 2.1. Data Understanding
  - 2.2. Data Insights
  - 2.3. Data Pre-processing
- Model Building
  - 3.1. Variable Selection and Model Fitting
  - 3.2. Diagnostics and Remedies
    - a) Unusual observations
    - b) Error assumptions
    - c) Structure assumptions
- Model Comparison and Selection
  - 4.1. A model with an interaction term
  - 4.2. LASSO Regression
- Discussion of Results and Conclusion
  - 5.1. Summary
  - 5.2. Challenges and Next Steps
  - 5.3. Reflection on Lessons Learned
- Adjusted R^2
- VIF
- Pearson correlation
- ANOVA
- Cramer's V association
- Forward variable selection
- Lasso Regression
- Diagnostics/Remedies
  - Mahalanobis Distance
  - Studentized Residual Test
  - Cook's Distance
  - Q-Q plot / Shapiro-Wilk Test
  - Residuals vs Fitted plot / Breusch-Pagan Test
  - Residuals vs Index plot / Durbin-Watson Test
  - Added-Variable plots
  - Box-Cox transformation
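
As a rough illustration of two of the techniques listed above (VIF and Cook's Distance), here is a minimal NumPy sketch. It is not the project's actual code: the data are synthetic, the variable names (`z1`, `z2`, `z3`) are hypothetical and do not correspond to columns in survey.csv, and the collinearity and the influential observation are planted deliberately so that both diagnostics have something to flag.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals from a least-squares fit of y on X (X includes the intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def vif(Z):
    """Variance inflation factor for each column of Z (predictors only, no intercept)."""
    n, k = Z.shape
    values = np.empty(k)
    for j in range(k):
        # Regress predictor j on all remaining predictors plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
        resid = ols_residuals(others, Z[:, j])
        r2 = 1.0 - (resid @ resid) / np.sum((Z[:, j] - Z[:, j].mean()) ** 2)
        values[j] = 1.0 / (1.0 - r2)
    return values


def cooks_distance(X, y):
    """Cook's distance for each observation; X includes the intercept column."""
    n, p = X.shape
    hat = X @ np.linalg.pinv(X)      # hat matrix for a full-column-rank X
    h = np.diag(hat)                 # leverages
    resid = y - hat @ y
    mse = resid @ resid / (n - p)
    return (resid ** 2 / (p * mse)) * h / (1.0 - h) ** 2


# Synthetic data (assumption: stands in for the real survey predictors).
rng = np.random.default_rng(0)
n = 60
z1 = rng.normal(size=n)
z2 = z1 + 0.1 * rng.normal(size=n)   # nearly collinear with z1
z3 = rng.normal(size=n)
z3[0] = 3.0                          # give observation 0 high leverage
Z = np.column_stack([z1, z2, z3])
y = 1.0 + 2.0 * z1 + 0.5 * z3 + rng.normal(size=n)
y[0] += 30.0                         # and a large response error

X = np.column_stack([np.ones(n), Z])
vifs = vif(Z)
cooks = cooks_distance(X, y)

print("VIF per predictor:", np.round(vifs, 2))
print("Most influential observation:", int(np.argmax(cooks)))
```

A common rule of thumb treats a VIF above 10 as a sign of problematic collinearity and a Cook's distance above 4/n as worth inspecting; these cutoffs are conventions, not necessarily the thresholds used in project_report.pdf.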
- survey.csv: raw dataset
- survey_postEDA.csv: dataset after cleaning and preprocessing