Problem: Predict each student’s grade as “pass” or “fail” before (a) the Mid-II exam and (b) the Final exam.
For prediction before Mid-II, use the following features: the first four assignment scores, the first
four quiz scores, and the Mid-I score. For prediction before the Final exam, use all the features
(taking the best 5 assignments and quizzes).
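The two feature sets described above can be sketched as follows. This is only an illustration: the record layout and the keys "assignments", "quizzes", "mid1" and "mid2" are hypothetical, and "best 5" is assumed here to mean the five highest scores of each kind.

```python
from typing import Dict, List


def best_k(scores: List[float], k: int = 5) -> List[float]:
    """Return the k highest scores, highest first (all of them if fewer than k)."""
    return sorted(scores, reverse=True)[:k]


def features_before_mid2(student: Dict) -> List[float]:
    """First four assignments, first four quizzes, and the Mid-I score."""
    return student["assignments"][:4] + student["quizzes"][:4] + [student["mid1"]]


def features_before_final(student: Dict) -> List[float]:
    """Best 5 assignments, best 5 quizzes, Mid-I and Mid-II (assumed reading)."""
    return (
        best_k(student["assignments"])
        + best_k(student["quizzes"])
        + [student["mid1"], student["mid2"]]
    )


# Hypothetical record; the real sheets may use different names and totals.
student = {
    "assignments": [8, 6, 9, 7, 10, 5],
    "quizzes": [4, 5, 3, 5, 2, 4],
    "mid1": 12.0,
    "mid2": 15.0,
}
mid2_x = features_before_mid2(student)
final_x = features_before_final(student)
```

Sorting the best scores discards which assignment or quiz each score came from; that keeps the feature vector the same length across sheets with different numbers of assessments.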
Objectives: To answer the following two research questions.
- RQ-1: How accurately can we predict students’ grades before the Mid-II exam?
- RQ-2: How accurately can we predict students’ grades before the Final exam?
Dataset: The dataset contains students’ assessment scores, including <Assignments, Quizzes, Mid-I, Mid-II>, and the variable to be predicted. The data has been anonymized to hide the identities of
the students and course(s). The data is shared on seven sheets (D1 to D7), where each sheet
contains a different number of assignments and quizzes. However, only the best 5 assignments and
quizzes are counted for each student when calculating their grades. Also note that the total marks for
assignments and quizzes are given at the top, along with their corresponding weights.
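Because the sheets differ in the number of assessments, it may help to put scores on a common scale before comparing them. The sketch below divides each score by its total marks and then combines the best 5 into a weighted component. The combination rule (weight times the mean of the best fractional scores) is an assumption for illustration, not the course's stated grading formula, and all names are hypothetical.

```python
from typing import List


def to_fractions(scores: List[float], totals: List[float]) -> List[float]:
    """Convert raw scores to fractions of their total marks."""
    if len(scores) != len(totals):
        raise ValueError("scores and totals must have the same length")
    return [s / t for s, t in zip(scores, totals)]


def weighted_component(
    scores: List[float], totals: List[float], weight: float, k: int = 5
) -> float:
    """Weight times the mean of the k best fractional scores (assumed rule).

    Missing submissions should be recorded as 0 in `scores` before calling.
    """
    best = sorted(to_fractions(scores, totals), reverse=True)[:k]
    if not best:
        return 0.0
    return weight * sum(best) / len(best)


# Hypothetical example: six assignments out of 10 marks each, weight 10.
assignment_part = weighted_component([8, 6, 9, 7, 10, 5], [10] * 6, weight=10)
```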
Perform exploratory data analysis (EDA) on the given dataset to understand and preprocess the data; this should help you in the second phase of the project.
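A first pass at EDA could tabulate missing values, ranges and class balance. The following is a minimal sketch using only the standard library; it assumes missing scores are represented as None, and the example values are made up.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Optional


def summarize(values: List[Optional[float]]) -> Dict[str, float]:
    """Count, missing count, min, max and mean of one score column."""
    present = [v for v in values if v is not None]
    if not present:
        nan = float("nan")
        return {"count": len(values), "missing": len(values),
                "min": nan, "max": nan, "mean": nan}
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
    }


def class_balance(labels: List[str]) -> Dict[str, float]:
    """Fraction of records in each class (e.g. pass/fail)."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Made-up column and labels, for illustration only.
quiz_summary = summarize([8, None, 6, 10])
balance = class_balance(["pass", "pass", "fail", "pass"])
```

A strongly skewed class balance is worth noting early, since it determines the baseline accuracy that the classifiers have to beat.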
Model training and results reporting using the classifiers (nearest neighbor, decision tree)
- Report data analyses that you performed through charts/tables.
- Report the issue(s) that you identified and the corrective measures taken in the preprocessing phase.
- Paste screenshots if a tool is used, submit the code otherwise.
- Details of data preprocessing steps (if performed).
- Model’s baseline accuracy.
- Results reporting: confusion matrix, performance evaluation metrics (accuracy, sensitivity, specificity, etc.), tables and/or charts.
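The reporting items above can be illustrated with a small end-to-end sketch: a 1-nearest-neighbor classifier, a majority-class baseline, and confusion-matrix metrics. Everything here is an assumption for illustration: the toy data is made up, the baseline is taken to be "always predict the majority training class", and "fail" is treated as the positive class when computing sensitivity and specificity.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def nearest_neighbor_predict(
    train_x: List[Sequence[float]], train_y: List[str], x: Sequence[float]
) -> str:
    """1-nearest-neighbor: label of the closest training point (Euclidean)."""
    distances = [math.dist(row, x) for row in train_x]
    return train_y[distances.index(min(distances))]


def baseline_accuracy(train_y: List[str], test_y: List[str]) -> float:
    """Accuracy of always predicting the majority class of the training labels."""
    majority = Counter(train_y).most_common(1)[0][0]
    return sum(y == majority for y in test_y) / len(test_y)


def confusion_counts(
    y_true: List[str], y_pred: List[str], positive: str = "fail"
) -> Tuple[int, int, int, int]:
    """Return (TP, FN, FP, TN) with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fn, fp, tn


def evaluate(y_true: List[str], y_pred: List[str], positive: str = "fail") -> Dict[str, float]:
    """Accuracy, sensitivity (TP rate) and specificity (TN rate)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
    }


# Made-up two-feature data, for illustration only.
train_x = [[9, 18], [8, 15], [3, 6], [2, 4], [7, 14]]
train_y = ["pass", "pass", "fail", "fail", "pass"]
test_x = [[8, 16], [3, 5], [5, 11], [6, 12]]
test_y = ["pass", "fail", "fail", "pass"]

predictions = [nearest_neighbor_predict(train_x, train_y, x) for x in test_x]
report = evaluate(test_y, predictions)
baseline = baseline_accuracy(train_y, test_y)
```

Predictions from a decision tree can be passed to the same `evaluate` function. Nearest-neighbor distances depend on feature scale, so scores with different total marks may need to be normalized first.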