
Conducted a comprehensive clustering analysis to categorize beers based on features such as Astringency, Alcohol content, Bitterness, Sourness, and more. Utilized k-medoids and hierarchical agglomerative clustering algorithms to achieve this classification. Tech: Python (numpy, pandas, seaborn, matplotlib, sklearn, scipy)


SaniyaAbushakimova/Brewing-Insights-with-Unsupervised-Learning


Project completed on May 7, 2024.

Project description

In the world of beer, certain varieties stand out due to their versatile flavors, making them popular choices among consumers. A business owner aiming to meet popular demand needs to curate a simple yet appealing range of beers. However, given the overwhelming number of beer styles available, it is impractical to include every type in the inventory.

This project uses clustering analysis to help the business owner identify a representative sample of beers. By examining features of different beers (e.g., Astringency, Bitter, Alcohol), the analysis groups them into distinct clusters, so the owner can select a diverse yet manageable assortment for their inventory.

Project outline

analysis_and_report.ipynb

  1. Introduction
  2. Dataset Discussion
  3. Dataset Cleaning and Exploration
  4. Basic Descriptive Analytics
  5. Scaling Decisions
  6. Clusterability and Clustering Structure
  7. Clustering Algorithm Selection Motivation
  8. Clustering Algorithm #1: K-Medoids
  9. Clustering Algorithm #2: HAC with Ward's Linkage
  10. Discussion
  11. Conclusion
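As a rough illustration of steps 5 and 9 of the outline, the sketch below scales a feature matrix, builds a Ward's-linkage hierarchy, and compares average silhouette scores across candidate cluster counts. It is not taken from the notebook: the synthetic data stands in for the beer features, and the choice of `StandardScaler` and the range of `k` values are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in for the beer feature matrix: three well-separated
# groups of 50 rows with 4 numeric features each.
X = np.vstack([rng.normal(loc, 1.0, size=(50, 4)) for loc in (0.0, 6.0, 12.0)])

# Put all features on a comparable scale (assumed choice of scaler).
X_scaled = StandardScaler().fit_transform(X)

# Build the full agglomerative hierarchy once with Ward's linkage.
Z = linkage(X_scaled, method="ward")

# Cut the tree at several cluster counts and score each cut.
scores = {}
for k in range(2, 7):
    labels = fcluster(Z, t=k, criterion="maxclust")
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
print("average silhouette by k:", scores)
print("k with the highest average silhouette:", best_k)
```

Building the linkage matrix once and cutting it repeatedly avoids refitting the hierarchy for every candidate `k`. The same `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` for plotting.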

Unsupervised Learning tools used in this project

  • Hopkins Statistic
  • t-SNE plot
  • Elbow plot
  • Average Silhouette score
  • Silhouette plot
  • Cluster Sorted Similarity Matrix
  • K-Medoids Clustering
  • Hierarchical Agglomerative Clustering (HAC) with Single, Complete, Average, Ward's linkages
  • Dendrogram
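Of the tools above, the Hopkins statistic is the one least often available as a ready-made library call, so a minimal NumPy sketch may help. It follows one common convention, in which values near 0.5 suggest spatially random data and values near 1 suggest clustering tendency. Conventions differ: some formulations raise the distances to the power of the dimension, and some report the complement. The function name, sample size default, and seed are assumptions for this example and are not taken from the notebook.

```python
import numpy as np

def hopkins_statistic(X, m=None, seed=0):
    """Hopkins statistic sketch: near 0.5 for uniform data, near 1 for clustered data."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = m or max(1, n // 10)  # assumed default: sample about 10% of the rows

    # Sample m real observations without replacement.
    idx = rng.choice(n, size=m, replace=False)

    # Draw m artificial points uniformly inside the data's bounding box.
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))

    # u: distance from each artificial point to its nearest real observation.
    u = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    # w: distance from each sampled observation to its nearest *other* observation.
    D = np.linalg.norm(X[idx][:, None, :] - X[None, :, :], axis=2)
    D[np.arange(m), idx] = np.inf  # exclude each point's zero distance to itself
    w = D.min(axis=1)

    return u.sum() / (u.sum() + w.sum())

# Usage on synthetic data: two tight blobs versus uniform noise.
rng = np.random.default_rng(42)
clustered = np.vstack([rng.normal(c, 0.1, size=(100, 2)) for c in (0.0, 5.0)])
uniform = rng.uniform(0, 1, size=(200, 2))

h_clustered = hopkins_statistic(clustered)
h_uniform = hopkins_statistic(uniform)
print("clustered:", h_clustered, "uniform:", h_uniform)
```

The pairwise-distance computation here is quadratic in memory, which is fine for a dataset of a few thousand rows. For larger data, a nearest-neighbor index would be the more usual choice.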

Other details

beer_profile_and_ratings.csv -- raw dataset (retrieved from Kaggle)

presentation.pdf -- a short presentation with the project overview
