Merge pull request #279 from PrashantDixit0/main

restructuring: recommendation sys notebooks
lancedb · Dec 28, 2024 · eb5a4fd
2 parents 5771038 + ef125f1
Showing 8 changed files with 1,313 additions and 1,263 deletions.
diff --git a/README.md b/README.md
@@ -162,7 +162,7 @@ Create a recommender system application with LanceDB for efficient vector-based
 | [Movie Recommender](/examples/movie-recommender/) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/movie-recommender/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./examples/movie-recommender/main.py) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)|  |
 | [Product Recommender](./examples/product-recommender/) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/product-recommender/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./examples/product-recommender/main.py) [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)| |
 | [Arxiv paper recommender](/examples/arxiv-recommender) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/arxiv-recommender/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./examples/arxiv-recommender/main.py) [![LLM](https://img.shields.io/badge/local-llm-green)](#)  [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)|  |
-| [Music Recommender](/applications/Music_Recommandation/) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/applications/Music_Recommandation/lancedb_music_recommandation.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)| |||||
+| [Music Recommender](/applications/Music_Recommendation/) | [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./applications/Music_Recommendation/app_music.py) [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)| |||||
 ||||
 
 ### Concepts

diff --git a/...andation/Music_recommndations_lancedb.png → ...endation/Music_recommndations_lancedb.png b/...andation/Music_recommndations_lancedb.png → ...endation/Music_recommndations_lancedb.png
diff --git a/applications/Music_Recommandation/README.md → applications/Music_Recommendation/README.md b/applications/Music_Recommandation/README.md → applications/Music_Recommendation/README.md
diff --git a/...cations/Music_Recommandation/app_music.py → ...cations/Music_Recommendation/app_music.py b/...cations/Music_Recommandation/app_music.py → ...cations/Music_Recommendation/app_music.py
diff --git a/...ons/Music_Recommandation/requirements.txt → ...ons/Music_Recommendation/requirements.txt b/...ons/Music_Recommandation/requirements.txt → ...ons/Music_Recommendation/requirements.txt
diff --git a/examples/arxiv-recommender/main.ipynb b/examples/arxiv-recommender/main.ipynb
diff --git a/examples/movie-recommender/main.ipynb b/examples/movie-recommender/main.ipynb
diff --git a/examples/product-recommender/main.ipynb b/examples/product-recommender/main.ipynb
@@ -8,22 +8,39 @@
       "source": [
         "# Product Recommender using Collaborative Filtering and LanceDB\n",
         "\n",
-        "We are going to use **LanceDB** and **Collaborative Filtering** to recommend products based on a user's past buying history. We used the <a href=\"https://www.kaggle.com/datasets/yasserh/instacart-online-grocery-basket-analysis-dataset\">**Instacart dataset**</a> as our data for this example.\n",
+        "Collaborative filtering is a method to recommend movies by analyzing user preferences. It works by finding patterns in what users like. For example:\n",
+        "\n",
+        "1. **User-based filtering**: If two users have similar tastes, movies liked by one can be suggested to the other.\n",
+        "\n",
+        "2. **Item-based filtering**: If two movies are often liked together, recommending one suggests the other.\n",
+        "\n",
+        "This approach uses past data, like movie ratings, to predict what someone might enjoy.\n",
+        "\n",
         "\n",
         "![picture](https://daxg39y63pxwu.cloudfront.net/images/blog/product-recommendation-system-projects/Product_Recommendation_System_Project_Ideas_and_Examples.png)"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "In this example, we’ll use **LanceDB** and **Collaborative Filtering** to recommend products based on a user's purchase history. The data comes from the <a href=\"https://www.kaggle.com/datasets/yasserh/instacart-online-grocery-basket-analysis-dataset\">**Instacart dataset**</a>."
+      ],
+      "metadata": {
+        "id": "aExXL-0eFCC8"
+      }
+    },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "lXd46ecEt5G7"
       },
       "source": [
-        "To downloading dataset in this example, you must have a Kaggle account.\n",
+        "## Download dataset from Kaggle\n",
         "\n",
+        "To download the dataset in this example, you must have a Kaggle account.\n",
         "To get the Kaggle API credentials,\n",
         "\n",
-        "Go to the Your Profile -> Settings -> Create Token\n",
+        "Go to ***Your Profile -> Settings -> Create Token***\n",
         "\n",
         "This will download `kaggle.json`, a file containing your API credentials.\n",
         "\n",
@@ -33,6 +50,7 @@
     {
       "cell_type": "code",
       "source": [
+        "# install and copy credentials for downloading dataset\n",
         "! pip install kaggle\n",
         "! mkdir ~/.kaggle\n",
         "! cp kaggle.json ~/.kaggle/\n",
@@ -45,7 +63,7 @@
           "base_uri": "https://localhost:8080/"
         }
       },
-      "execution_count": 1,
+      "execution_count": null,
       "outputs": [
         {
           "output_type": "stream",
@@ -79,7 +97,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 2,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -167,7 +185,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 3,
+      "execution_count": null,
       "metadata": {
         "id": "emp_MSXZt5G8"
       },
@@ -197,7 +215,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 4,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -231,7 +249,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 5,
+      "execution_count": null,
       "metadata": {
         "id": "f3g296nL4_zZ"
       },
@@ -261,7 +279,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 6,
+      "execution_count": null,
       "metadata": {
         "id": "cBbbR7Rut5G_"
       },
@@ -294,7 +312,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 7,
+      "execution_count": null,
       "metadata": {
         "id": "ZjRh7RYpt5HB"
       },
@@ -325,7 +343,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 8,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -649,7 +667,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 9,
+      "execution_count": null,
       "metadata": {
         "id": "v2_2R7zmt5HE"
       },
@@ -693,14 +711,14 @@
         "id": "JDwIxGMnqNx8"
       },
       "source": [
-        "# Difference between colloborative and content filtering\n",
+        "# Difference between Collaborative and Content-based filtering\n",
         "\n",
         "![picture](https://miro.medium.com/v2/resize:fit:1400/0*R8qw_CXxCc4600bQ.png)"
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 10,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -776,7 +794,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 11,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -845,7 +863,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 12,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -922,7 +940,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 13,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -1037,13 +1055,13 @@
         "id": "38rssdYCBR4E"
       },
       "source": [
-        "## Let's save the data and create a empty LanceDB Table using a Pydantic model.\n",
+        "### Let's save the data and create an empty LanceDB Table using a Pydantic model\n",
         "A Table is designed to store large numbers of columns and huge quantities of data! For those interested, a LanceDB is columnar-based, and uses Lance, an open data format to store data."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 14,
+      "execution_count": null,
       "metadata": {
         "id": "3_ykVLT6t5HH"
       },
@@ -1054,7 +1072,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 15,
+      "execution_count": null,
       "metadata": {
         "id": "ufHsF0o4t5HI"
       },
@@ -1082,7 +1100,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 16,
+      "execution_count": null,
       "metadata": {
         "id": "NOOPF9zOt5HJ"
       },
@@ -1104,12 +1122,12 @@
         "id": "j3aU4z-tSbWE"
       },
       "source": [
-        "## Let's create an ANN index in order to speed up retrieval. This might take a while."
+        "### Let's create an ANN index in order to speed up retrieval. This might take a while."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 17,
+      "execution_count": null,
       "metadata": {
         "id": "H8HyvjCFSeaz"
       },
@@ -1130,7 +1148,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 18,
+      "execution_count": null,
       "metadata": {
         "id": "Uzgk5Od0t5HK"
       },
@@ -1164,7 +1182,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 19,
+      "execution_count": null,
       "metadata": {
         "id": "Wwl7yFKTt5HK"
       },
@@ -1180,12 +1198,12 @@
         "id": "wTh61ou3t5HL"
       },
       "source": [
-        "## Let's now query LanceDB to retrieve recommendations."
+        "### Let's now query LanceDB to retrieve recommendations."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 20,
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -2408,6 +2426,15 @@
         "    display(results)\n",
         "    display(products_bought_by_user_in_the_past(id, top=15))"
       ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Tada! Your first product recommendation system is live"
+      ],
+      "metadata": {
+        "id": "SMjD6-nIFt9F"
+      }
     }
   ],
   "metadata": {