Zillow Property Value Predictions

1. Introduction

In the rapidly evolving real estate market, accurate property valuation holds immense importance for homeowners and property investors alike. This project is dedicated to developing a predictive model for property values of Single Family Properties that underwent transactions during the year 2017. This project holds significant importance as it directly addresses the critical need to provide accurate property values. By predicting properety values, we aim to empower our users with valuable insights, aiding in informed decision-making and enhancing their overall experience on the Zillow platform.

2. Goals and Objectives

The primary goal of the Zillow Property Value Predictions project is to predict property values accurately. To achieve this goal, we have established the following objectives:

Data Collection and Preprocessing: Gather and clean Zillow property data to create a comprehensive dataset suitable for analysis.
Exploratory Data Analysis: Perform exploratory analysis to identify trends, patterns, and potential correlations related to the value of the properties.
Feature Importance Determination: Employ machine learning techniques to assess the importance of various features in predicting property values, aiding in identifying critical factors.
Model Building and Evaluation: Develop predictive models for property values, compare their performance, and select the most effective one for accurate value prediction.

3. Data Acquisition and Preparation

Data Sources:
1. Zillow Database: The primary source of property data will be the Zillow database. Data will be obtained via SQL queries, specifically using the predictions_2017 and properties_2017 tables.
2. US Census Data: To enrich our dataset with county and state-level information, we will use US Census data. This data will provide mapping for FIPS codes to county and state names, as the Zillow database does not contain this geographical information.
Data Collection: predictions_2017 (pred) table to filter properties that underwent transactions in 2017. then left join with the properties_2017 (prop) table to acquire the following key attributes: - pred.parcelid - prop.bedroomcnt - prop.bathroomcnt - prop.calculatedfinishedsquarefeet - prop.taxvaluedollarcnt - prop.yearbuilt - prop.fips
Data Preprocessing:
- Column Renaming: Certain columns will be renamed for clarity and consistency.
- FIPS Code Mapping: FIPS codes will be mapped to county and state names, enriching the dataset with geographical information.
- Data Cleaning: The dataset will undergo cleaning procedures, including the removal of rows with null values and rows where the values of either bedrooms or bathrooms are zero.
- Data Type Conversion: Selected columns will have their data types converted to integers for consistency in analysis.

4. Exploratory Data Analysis (EDA)

Data Overview:
- Present a summary of the dataset's characteristics (e.g., size, data types).
- Mention any initial observations or challenges.
Visualizations:
- Create visualizations to explore data distributions, trends, and patterns.
- Highlight key findings related to the project's objectives.
Feature Analysis:
- Investigate the impact of individual features on the target variable or outcomes.
- Identify correlations or relationships between features.
- Explore potential segments within the data.

5. Hypotheses Testing and Initial Questions

Hypotheses Formulation:
- Formulate hypotheses based on EDA insights and domain knowledge.
- Clearly define null and alternative hypotheses.
Initial Questions:
- List and address initial questions about the data or problem.
- Include any overarching questions that guide your analysis.

6. Feature Importance and Model Development

Feature Selection:
- Use EDA and hypotheses testing findings to select relevant features.
- Explain the criteria for feature selection.
Model Building:
- Develop predictive models using appropriate algorithms (e.g., regression, classification).
- Document the libraries, frameworks, or tools used for modeling.

7. Model Selection and Evaluation

Model Comparison:
- Compare the performance of multiple models using appropriate metrics.
- Present results using visualizations or tables.
Model Selection:
- Choose the best-performing model based on evaluation metrics and insights.
- Justify the selection with clear reasoning.

8. Model Testing and Generalization

Model Testing:
- Evaluate the selected model's performance on an independent test dataset.
- Assess its ability to generalize to new, unseen data.
Interpretation:
- Interpret model predictions and provide insights into the problem.
- Discuss the strengths and limitations of the chosen model.

9. Conclusion and Recommendations

Summary:
- Summarize the key findings and insights obtained from the data analysis.
- Revisit the project's goals and objectives.
Recommendations:
- Provide actionable recommendations or strategies based on the project's outcomes.
- Suggest steps or interventions for addressing the problem.

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
.scratchbooks		.scratchbooks
Readme.md		Readme.md
acquire.py		acquire.py
explore.py		explore.py
final_report.ipynb		final_report.ipynb
final_report_plus.ipynb		final_report_plus.ipynb
model.py		model.py
prepare.py		prepare.py
visuals.py		visuals.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Zillow Property Value Predictions

1. Introduction

2. Goals and Objectives

3. Data Acquisition and Preparation

4. Exploratory Data Analysis (EDA)

5. Hypotheses Testing and Initial Questions

6. Feature Importance and Model Development

7. Model Selection and Evaluation

8. Model Testing and Generalization

9. Conclusion and Recommendations

About

Releases

Packages

Languages

Jonathan-Garcia1/zillow_porperty_values

Folders and files

Latest commit

History

Repository files navigation

Zillow Property Value Predictions

1. Introduction

2. Goals and Objectives

3. Data Acquisition and Preparation

4. Exploratory Data Analysis (EDA)

5. Hypotheses Testing and Initial Questions

6. Feature Importance and Model Development

7. Model Selection and Evaluation

8. Model Testing and Generalization

9. Conclusion and Recommendations

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages