This project aims to develop a machine learning model that accurately predicts pizza prices from factors such as size, toppings, and company. By analyzing a dataset of pizza orders, I explore the relationships between these variables and the price of a pizza.

The primary objective is to create a reliable model that can help businesses price their pizza offerings competitively and profitably. The model can also provide insights into consumer preferences and market trends.

The project follows a structured methodology: data exploration, preprocessing, feature engineering, model selection, training, evaluation, and deployment. Through this process, I aim to build a robust and effective machine learning solution for pizza price prediction.

A. Import Libraries and Data Screening

1. First 5 Rows
2. Shape of the Dataset
3. Get information about the dataset: total number of rows, total number of columns, datatype of each column, and memory requirement
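
The screening steps in this section can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the sample rows are made up, and every column name except price_rupiah is an assumption based on the columns mentioned in this outline. A real run would load the dataset from a file instead of building it inline.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
# Only price_rupiah is a column name taken from the outline; the rest are assumed.
pizza_df = pd.DataFrame(
    {
        "company": ["A", "A", "B", "B", "C", "C"],
        "price_rupiah": [
            "Rp235,000", "Rp198,000", "Rp120,000",
            "Rp155,000", "Rp248,000", "Rp140,000",
        ],
        "diameter": ["22 inch", "20 inch", "16 inch", "14 inch", "18 inch", "18.5 inch"],
        "topping": ["chicken", "mushrooms", "tuna", "mozzarella", "smoked beef", "vegetables"],
        "variant": ["double_signature", "classic", "classic", "meat_lovers", "double_mix", "classic"],
        "size": ["jumbo", "jumbo", "regular", "regular", "XL", "XL"],
        "extra_sauce": ["yes", "no", "yes", "yes", "no", "no"],
        "extra_cheese": ["yes", "yes", "no", "no", "yes", "no"],
    }
)

# 1. First 5 rows
first_rows = pizza_df.head()
print(first_rows)

# 2. Shape of the dataset as (rows, columns)
n_rows, n_cols = pizza_df.shape
print(f"{n_rows} rows, {n_cols} columns")

# 3. Column datatypes, non-null counts, and memory usage
pizza_df.info()
```
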

B. Data Cleaning

5. Check Null Values in the Dataset
6. Data Preprocessing
   ○ Rename the column price_rupiah to a KES price column (or to the currency of your choice).
   ○ Remove 'Rp' and commas from the price column and convert it to integers.
   ○ Convert the price column from Indonesian Rupiah to KES.
   ○ Remove 'inch' from the diameter column and convert it to floats.
   ○ Display the first five rows after preprocessing using pizza_df.head().
7. Conduct Univariate Analysis for the columns: Company, Price, Diameter, Topping, Variant, Size, Extra Sauce, and Extra Cheese
8. Conduct Bivariate Analysis:
   ○ Price by company
   ○ Price by topping
   ○ Price by size
10. Find the most expensive pizza
11. Find diameters of jumbo size pizzas
12. Find diameters of XL size pizzas
13. Remove outliers in the jumbo size pizzas
14. Label Encoding
15. Create the Feature Matrix X and Response (Target) Vector y
16. Split the data into Train and Test sets
17. Building the Models
18. Training the Models
19. Prediction on Test Data
20. Model Evaluation
21. Feature Importance
    21.a. Random Forest
    21.b. Gradient Boosting Regressor
    21.c. XGBRegressor
22. Saving the Best Model
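
As an illustration of the null check and preprocessing steps listed above, the sketch below cleans a few made-up rows. Several things here are assumptions, not taken from the project: the sample data, the new column name price_kes, the lowercase names of the diameter and size columns, and the exchange rate constant KES_PER_IDR, which is a placeholder that should be replaced with a current rate.

```python
import pandas as pd

# Hypothetical raw rows; the 'Rp' prefix, commas, and 'inch' suffix follow the outline.
pizza_df = pd.DataFrame(
    {
        "price_rupiah": ["Rp235,000", "Rp198,000", "Rp120,000", "Rp72,000"],
        "diameter": ["22 inch", "20 inch", "8.5 inch", "12 inch"],
        "size": ["jumbo", "jumbo", "small", "medium"],
    }
)

# Assumed exchange rate, for illustration only; look up the real rate before use.
KES_PER_IDR = 0.0085

# Count null values in each column
null_counts = pizza_df.isnull().sum()
print(null_counts)

# Rename the price column (price_kes is an assumed name)
pizza_df = pizza_df.rename(columns={"price_rupiah": "price_kes"})

# Strip 'Rp' and commas, then convert the remaining digits to integers
pizza_df["price_kes"] = (
    pizza_df["price_kes"]
    .str.replace("Rp", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(int)
)

# Convert from Indonesian Rupiah to KES using the assumed rate
pizza_df["price_kes"] = pizza_df["price_kes"] * KES_PER_IDR

# Strip 'inch' from the diameter column and convert to floats
pizza_df["diameter"] = (
    pizza_df["diameter"].str.replace("inch", "", regex=False).str.strip().astype(float)
)

# Row with the highest price, and the diameters of the jumbo size pizzas
most_expensive = pizza_df.loc[pizza_df["price_kes"].idxmax()]
jumbo_diameters = pizza_df.loc[pizza_df["size"] == "jumbo", "diameter"]

print(pizza_df.head())
```

Using regex=False treats 'Rp', the comma, and 'inch' as literal strings, so no character in them is interpreted as a regular expression pattern.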
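
The modeling steps (label encoding, building X and y, splitting, training, prediction, evaluation, feature importance, and saving the best model) might look something like the sketch below. It rests on several assumptions: the rows are made up and already cleaned, the column names are hypothetical, R² is used as the evaluation metric because the outline does not name one, and the file name pizza_price_model.joblib is invented for the example. XGBRegressor is left out to keep the sketch to scikit-learn; it could be added to the models dictionary if xgboost is installed.

```python
import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical, already-cleaned rows; a real run would use the preprocessed dataset.
pizza_df = pd.DataFrame(
    {
        "company": ["A", "B", "C"] * 4,
        "diameter": [8.0, 8.5, 12.0, 12.0, 14.0, 14.0, 16.0, 17.0, 18.5, 20.0, 22.0, 22.0],
        "topping": ["chicken", "mushrooms", "tuna"] * 4,
        "size": ["small", "small", "medium", "medium", "medium", "large",
                 "large", "large", "XL", "XL", "jumbo", "jumbo"],
        "price_kes": [600, 650, 850, 900, 1000, 1050, 1200, 1300, 1450, 1600, 1900, 2000],
    }
)

# Label encoding: map each category to an integer code, one encoder per column
for col in ["company", "topping", "size"]:
    pizza_df[col] = LabelEncoder().fit_transform(pizza_df[col])

# Feature matrix X and target vector y
X = pizza_df.drop(columns="price_kes")
y = pizza_df["price_kes"]

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)              # training
    y_pred = model.predict(X_test)           # prediction on test data
    scores[name] = r2_score(y_test, y_pred)  # evaluation (R² is an assumed metric)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    print(name, round(scores[name], 3))
    print(importances.sort_values(ascending=False))

# Save whichever model scored highest on the test set
best_name = max(scores, key=scores.get)
joblib.dump(models[best_name], "pizza_price_model.joblib")
```

The saved model can later be restored with joblib.load("pizza_price_model.joblib") and used to predict prices for new rows, provided those rows are encoded the same way as the training data.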