This project uses Jupyter Notebooks to clean, merge, and analyze an open-source dataset of book recommendations, and to build a recommender system that helps traditional book publishing houses compete and stay relevant in today's dynamic book market.
I selected this dataset because it meets the requirements of the CareerFoundry project brief and has the potential to support a compelling, insightful project within the book industry.
Implement a feedback loop within the recommender system that lets users rate and comment on suggested books. The goal is to use this feedback to continually improve recommendations so that they adapt to readers' changing tastes and preferences over time.
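A minimal sketch of how such a feedback loop might work, assuming an in-memory store and a 1 to 10 rating scale. The `FeedbackLoop` class, its method names, and the sample identifiers are hypothetical and are not taken from the project's notebooks.

```python
from collections import defaultdict


class FeedbackLoop:
    """Hypothetical in-memory store that folds new user ratings into book scores."""

    def __init__(self):
        # isbn -> {user_id: rating}; a later rating by the same user replaces the earlier one
        self.ratings = defaultdict(dict)

    def record(self, user_id, isbn, rating):
        # Assumed scale: explicit ratings from 1 to 10
        if not 1 <= rating <= 10:
            raise ValueError("rating must be between 1 and 10")
        self.ratings[isbn][user_id] = rating

    def mean_rating(self, isbn):
        votes = self.ratings.get(isbn, {}).values()
        return sum(votes) / len(votes) if votes else 0.0

    def recommend(self, user_id, n=3):
        # Rank books the user has not rated yet by their current mean rating
        unseen = [isbn for isbn, votes in self.ratings.items() if user_id not in votes]
        return sorted(unseen, key=self.mean_rating, reverse=True)[:n]


loop = FeedbackLoop()
loop.record("u1", "book-a", 9)
loop.record("u2", "book-a", 7)
loop.record("u2", "book-b", 4)
loop.record("u3", "book-b", 10)
loop.record("u3", "book-c", 10)

before = loop.recommend("u1")   # ranking based on the ratings so far
loop.record("u2", "book-c", 2)  # new feedback lowers the mean for book-c
after = loop.recommend("u1")    # ranking reflects the new feedback
```

In this toy run, the recommendation order for `u1` changes once new feedback arrives, which is the behavior the feedback loop is meant to provide. A production system would persist the ratings and retrain or re-rank on a schedule rather than on every call.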
The owner of the dataset is Möbius on Kaggle. The data was compiled by Cai-Nicolas Ziegler in a 4-week crawl (August/September 2004) from the Book-Crossing community with kind permission from Ron Hornbaker, CTO of Humankind Systems.
The data is static and will not be updated. Because it contains no revenue data, financial predictions are not feasible.
The dataset contains 278,858 users (anonymized, but with demographic information) who provided 1,149,780 ratings (explicit and implicit) of 271,379 books. The data is open source, and the anonymization of users protects their privacy.
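The explicit and implicit ratings usually need to be separated before analysis. The sketch below assumes the column names `User-ID`, `ISBN`, and `Book-Rating`, and assumes that a rating of 0 marks an implicit interaction while 1 to 10 are explicit scores. The small DataFrame is made-up sample data standing in for the real ratings file.

```python
import pandas as pd

# Made-up sample rows; column names are assumed to match the ratings file
ratings = pd.DataFrame({
    "User-ID": [1, 1, 2, 3, 3],
    "ISBN": ["A", "B", "A", "B", "C"],
    "Book-Rating": [0, 8, 5, 0, 10],
})

# Assumption: 0 means an implicit interaction, 1 to 10 are explicit scores
explicit = ratings[ratings["Book-Rating"] > 0]
implicit = ratings[ratings["Book-Rating"] == 0]

# Per-book mean and number of explicit ratings
summary = explicit.groupby("ISBN")["Book-Rating"].agg(["mean", "count"])
print(summary)
```

Averaging over all rows without this split would pull every mean toward zero, so the explicit subset is the safer basis for rating statistics.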
Language: Python
Libraries: Pandas, NumPy, Matplotlib, seaborn, folium, json, scikit-learn (train_test_split, LinearRegression, mean_squared_error, r2_score, KMeans)
Software: Jupyter Notebooks, Tableau, Canva
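The scikit-learn components listed above typically fit together as shown in the sketch below. It runs on synthetic data, because the variables modeled in the project are not stated here; the feature, target, and cluster count are illustrative assumptions only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: one feature with a noisy linear relationship to the target
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.5, size=200)

# Hold out 30% of the rows to evaluate the regression
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.3f}, R^2: {r2:.3f}")

# Unsupervised grouping of the same feature into three clusters
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
```

Fixing `random_state` keeps the split and the clustering reproducible between notebook runs.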