Skip to content

Jonathan-Garcia1/zillow_porperty_values

Repository files navigation

Zillow Property Value Predictions

1. Introduction

In the rapidly evolving real estate market, accurate property valuation holds immense importance for homeowners and property investors alike. This project is dedicated to developing a predictive model for property values of Single Family Properties that underwent transactions during the year 2017. This project holds significant importance as it directly addresses the critical need to provide accurate property values. By predicting properety values, we aim to empower our users with valuable insights, aiding in informed decision-making and enhancing their overall experience on the Zillow platform.

2. Goals and Objectives

The primary goal of the Zillow Property Value Predictions project is to predict property values accurately. To achieve this goal, we have established the following objectives:

  • Data Collection and Preprocessing: Gather and clean Zillow property data to create a comprehensive dataset suitable for analysis.

  • Exploratory Data Analysis: Perform exploratory analysis to identify trends, patterns, and potential correlations related to the value of the properties.

  • Feature Importance Determination: Employ machine learning techniques to assess the importance of various features in predicting property values, aiding in identifying critical factors.

  • Model Building and Evaluation: Develop predictive models for property values, compare their performance, and select the most effective one for accurate value prediction.

3. Data Acquisition and Preparation

  • Data Sources:

    1. Zillow Database: The primary source of property data will be the Zillow database. Data will be obtained via SQL queries, specifically using the predictions_2017 and properties_2017 tables.

    2. US Census Data: To enrich our dataset with county and state-level information, we will use US Census data. This data will provide mapping for FIPS codes to county and state names, as the Zillow database does not contain this geographical information.

  • Data Collection: predictions_2017 (pred) table to filter properties that underwent transactions in 2017. then left join with the properties_2017 (prop) table to acquire the following key attributes: - pred.parcelid - prop.bedroomcnt - prop.bathroomcnt - prop.calculatedfinishedsquarefeet - prop.taxvaluedollarcnt - prop.yearbuilt - prop.fips

  • Data Preprocessing:

    • Column Renaming: Certain columns will be renamed for clarity and consistency.

    • FIPS Code Mapping: FIPS codes will be mapped to county and state names, enriching the dataset with geographical information.

    • Data Cleaning: The dataset will undergo cleaning procedures, including the removal of rows with null values and rows where the values of either bedrooms or bathrooms are zero.

    • Data Type Conversion: Selected columns will have their data types converted to integers for consistency in analysis.

4. Exploratory Data Analysis (EDA)

  • Data Overview:

    • Present a summary of the dataset's characteristics (e.g., size, data types).
    • Mention any initial observations or challenges.
  • Visualizations:

    • Create visualizations to explore data distributions, trends, and patterns.
    • Highlight key findings related to the project's objectives.
  • Feature Analysis:

    • Investigate the impact of individual features on the target variable or outcomes.
    • Identify correlations or relationships between features.
    • Explore potential segments within the data.

5. Hypotheses Testing and Initial Questions

  • Hypotheses Formulation:

    • Formulate hypotheses based on EDA insights and domain knowledge.
    • Clearly define null and alternative hypotheses.
  • Initial Questions:

    • List and address initial questions about the data or problem.
    • Include any overarching questions that guide your analysis.

6. Feature Importance and Model Development

  • Feature Selection:

    • Use EDA and hypotheses testing findings to select relevant features.
    • Explain the criteria for feature selection.
  • Model Building:

    • Develop predictive models using appropriate algorithms (e.g., regression, classification).
    • Document the libraries, frameworks, or tools used for modeling.

7. Model Selection and Evaluation

  • Model Comparison:

    • Compare the performance of multiple models using appropriate metrics.
    • Present results using visualizations or tables.
  • Model Selection:

    • Choose the best-performing model based on evaluation metrics and insights.
    • Justify the selection with clear reasoning.

8. Model Testing and Generalization

  • Model Testing:

    • Evaluate the selected model's performance on an independent test dataset.
    • Assess its ability to generalize to new, unseen data.
  • Interpretation:

    • Interpret model predictions and provide insights into the problem.
    • Discuss the strengths and limitations of the chosen model.

9. Conclusion and Recommendations

  • Summary:

    • Summarize the key findings and insights obtained from the data analysis.
    • Revisit the project's goals and objectives.
  • Recommendations:

    • Provide actionable recommendations or strategies based on the project's outcomes.
    • Suggest steps or interventions for addressing the problem.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published