SPEED DATING

Challenge description

We are starting a new data visualization and exploration project. Your goal will be to try to understand love! It's a very complicated subject, so we've simplified it: you will try to understand what happens during a speed dating event, and especially what influences getting a second date.

This is a Kaggle competition; you can find more details here:

Speed Dating Dataset

Take some time to read the description of the challenge and try to understand each of the variables in the dataset. The document Speed Dating - Variable Description.md can help you with this.

Deliverable

To be successful in this project, you will need to do a descriptive analysis of the main factors that influence getting a second date.

Over the next few days, you'll learn how to use Python libraries like Seaborn, Plotly, and Bokeh to produce data visualizations that highlight relevant facts about the dataset.

For today, you can start exploring the dataset with pandas to extract some statistics.
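
As a starting point, here is a minimal sketch of a first pass with pandas. It runs on a few invented rows so that it works on its own; the column names (`gender`, `age`, `attr`, `match`) and the CSV file name are assumptions, so check them against the variable description before reusing the code on the real data.

```python
import pandas as pd

# Toy rows standing in for the real file; the values are invented.
# Column names are assumptions to check against the variable description.
df = pd.DataFrame({
    "gender": [0, 1, 0, 1, 0, 1],
    "age": [24, 27, 22, 30, 26, None],
    "attr": [6.0, 7.0, 5.0, 8.0, 9.0, 4.0],
    "match": [0, 1, 0, 1, 1, 0],
})
# With the real dataset you would load the CSV instead, for example:
# df = pd.read_csv("speed_dating.csv")  # hypothetical file name

# Size of the table and share of missing values per column
print(df.shape)
missing_share = df.isna().mean()
print(missing_share)

# Summary statistics for the numeric columns
print(df.describe())

# Overall match rate, then match rate within each gender group
match_rate = df["match"].mean()
match_rate_by_gender = df.groupby("gender")["match"].mean()
print(match_rate)
print(match_rate_by_gender)
```

The same pattern (`groupby` on one variable, then the mean of `match`) is a simple way to compare second-date rates across any categorical variable you find interesting.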

SPEED DATING - PART II

Let's produce our first visualizations with Seaborn. Based on the exploration you carried out, try to find good representations of the dataset that summarize key statistics as well as relationships between variables.
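
One possible way to cover both needs is sketched below: a bar plot to summarize a statistic (the match rate per group) and a heatmap to show relationships between variables. The data is invented and the column names (`gender`, `attr`, `fun`, `match`) are assumptions, not the confirmed names in the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented toy data; column names are assumptions.
df = pd.DataFrame({
    "gender": [0, 1, 0, 1, 0, 1, 0, 1],
    "attr": [6, 7, 5, 8, 9, 4, 7, 6],
    "fun": [5, 8, 4, 7, 9, 3, 6, 6],
    "match": [0, 1, 0, 1, 1, 0, 1, 0],
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Left: barplot shows the mean of "match" per gender, i.e. the match rate
sns.barplot(data=df, x="gender", y="match", ax=axes[0])
axes[0].set_title("Match rate by gender")

# Right: pairwise correlations between a few numeric variables
corr = df[["attr", "fun", "match"]].corr()
sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm", ax=axes[1])
axes[1].set_title("Correlation between variables")

fig.tight_layout()
# fig.savefig("first_plots.png")  # hypothetical output file name
```

Other Seaborn functions such as `histplot` or `boxplot` may suit your variables better; this is only one choice among many.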

SPEED DATING - PART III

Continue exploring the Speed Dating dataset. You can use Seaborn and/or Matplotlib if you want finer control over the rendering of your visualizations: for example, try to superimpose different plots, change the color palettes, add text annotations, etc.

One thing to remember: in EDA, there is no single way to highlight relevant information! There are choices to be made about which aspects of this dataset you'd like to explore, so don't hesitate to be creative :-)
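
The three customizations mentioned above could look like the following sketch, which draws a scatter plot and a regression line on the same axes, sets a custom palette, and adds a text note. The data is invented and the column names (`attr`, `like`, `match`) are assumptions; the helper column `outcome` is created here only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented toy data; column names are assumptions.
df = pd.DataFrame({
    "attr": [6, 7, 5, 8, 9, 4, 7, 6],
    "like": [6, 8, 5, 7, 9, 4, 7, 5],
    "match": [0, 1, 0, 1, 1, 0, 1, 0],
})
# Readable labels for the legend (illustrative helper column)
df["outcome"] = df["match"].map({0: "no match", 1: "match"})

fig, ax = plt.subplots(figsize=(6, 4))

# Custom palette: one explicit color per category
palette = {"no match": "tab:gray", "match": "tab:red"}
sns.scatterplot(data=df, x="attr", y="like", hue="outcome",
                palette=palette, ax=ax)

# Superimpose a linear trend line on the same axes (no points, no band)
sns.regplot(data=df, x="attr", y="like", scatter=False, ci=None,
            color="black", ax=ax)

# Add text in axes coordinates (0 to 1), top-left corner
ax.text(0.05, 0.95, "toy data, not real results",
        transform=ax.transAxes, va="top")

ax.set_title("Attractiveness rating vs. overall liking")
ax.set_xlabel("Attractiveness rating")
ax.set_ylabel("Overall liking")
fig.tight_layout()
```

Because every Seaborn call receives the same `ax`, each plot is layered on top of the previous one, and any Matplotlib method on `ax` can then adjust the result.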

SPEED DATING - PART IV

Let's finalize our project by adding some interactivity! Try using Plotly to make at least one interactive graph.

At the end of the afternoon, some of you will present your visualizations in front of the class. Your teacher will give you details about the organization.

The aim is to share your findings with your classmates and practice your presentation skills: don't hesitate to make it dynamic and fun!

Finally, if you'd like to see many code examples and EDAs of this dataset, you can visit this page.