Smart Search is a project that implements an advanced search feature for the free courses on the Analytics Vidhya platform. Users describe what they want to learn in natural language, and the search returns the most relevant courses.
It is built with LangChain, FAISS, and Sentence Transformers: course text is converted into embedding vectors, indexed with FAISS, and compared against the embedded query, so that search stays fast and relevant as the number of courses grows.
- Advanced search algorithms
- User-friendly interface
- Real-time search results
- Customizable search parameters
- LangChain
- FAISS
- Sentence Transformers (all-MiniLM-L6-v2)
- Streamlit
- Data Collection & Preprocessing: Scraping course data from Analytics Vidhya.
- Embedding Generation: Converting textual data into numerical vectors using Sentence Transformers.
- FAISS Indexing: Building an efficient similarity search system.
- Integration with Streamlit: Developing a user-friendly search interface.
- Ranking and Result Presentation: Displaying relevant courses based on cosine similarity.
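The ranking step can be illustrated without any of the project's dependencies. The following is a minimal sketch, not the project's actual code: it ranks toy 3-dimensional vectors by cosine similarity to a query vector. The function names `cosine_similarity` and `rank_courses`, the course titles, and the vectors are all hypothetical; in the real pipeline the vectors would come from Sentence Transformers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_courses(
    query_vec: Sequence[float],
    course_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every course against the query and return the top_k best matches."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in course_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:top_k]


# Toy data (hypothetical): each course is represented by a made-up vector.
toy_courses = {
    "Intro to Python": [1.0, 0.0, 0.0],
    "Deep Learning Basics": [0.0, 1.0, 0.0],
    "Python for Data Science": [0.8, 0.6, 0.0],
}
toy_query = [1.0, 0.2, 0.0]
top_matches = rank_courses(toy_query, toy_courses, top_k=2)
```

This brute-force loop compares the query with every course, which is fine for a small catalog. A FAISS index serves the same purpose when the number of vectors makes a plain Python loop too slow.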
AnalyticsVidhya-SmartSearch/
│
├── course_details.json # Scraped course data
├── scrape_courses.py # Data scraping script
├── index_courses.py # Embedding and indexing script
├── app.py # Streamlit application for smart search
├── course_faiss.index # FAISS index for efficient search
├── requirements.txt # Dependencies for the project
├── .gitignore # Git ignore file to exclude unnecessary files
└── README.md # Project documentation and instructions
- Data Collection: Scraping course titles, descriptions, curriculum, and additional information.
- Embedding Generation: Using Sentence Transformers to generate embeddings.
- FAISS Indexing: Creating an efficient and scalable search index.
- Streamlit Integration: Implementing a search interface for user interaction.
- Ranking Mechanism: Using cosine similarity to rank and display courses.
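The embedding and indexing steps listed above might look roughly like the sketch below. It rests on several assumptions that the repository may not share: that `course_details.json` holds a list of records with `title`, `description`, and `curriculum` fields, that the function names `course_to_text`, `build_index`, and `search` are free to use (they are hypothetical), and that cosine similarity is obtained by L2-normalizing the vectors and using an inner-product index (`faiss.IndexFlatIP`). The actual `index_courses.py` may use a different index type or go through LangChain's wrappers instead.

```python
from typing import Dict, List, Tuple


def course_to_text(course: Dict) -> str:
    """Join the (assumed) text fields of one course record into a single string."""
    parts = []
    for key in ("title", "description", "curriculum"):  # assumed field names
        value = course.get(key)
        if isinstance(value, list):
            value = " ".join(str(item) for item in value)
        if value:
            parts.append(str(value).strip())
    return " ".join(parts)


def build_index(courses: List[Dict], index_path: str = "course_faiss.index"):
    """Embed every course and store the vectors in a FAISS inner-product index."""
    # Heavy dependencies are imported here so the helper above works without them.
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    texts = [course_to_text(course) for course in courses]
    embeddings = model.encode(texts, convert_to_numpy=True).astype("float32")
    faiss.normalize_L2(embeddings)  # unit vectors: inner product equals cosine
    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)
    faiss.write_index(index, index_path)
    return model, index


def search(query: str, model, index, courses: List[Dict], top_k: int = 5) -> List[Tuple[str, float]]:
    """Return (title, cosine score) pairs for the top_k courses nearest the query."""
    import faiss

    query_vec = model.encode([query], convert_to_numpy=True).astype("float32")
    faiss.normalize_L2(query_vec)
    scores, ids = index.search(query_vec, top_k)
    return [
        (courses[i].get("title", ""), float(score))
        for score, i in zip(scores[0], ids[0])
        if i != -1  # FAISS pads with -1 when fewer than top_k vectors exist
    ]
```

Under these assumptions, loading the records with `json.load`, calling `build_index(courses)` once, and then calling `search("beginner NLP course", model, index, courses)` would return the best-matching titles with their similarity scores.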
- Python 3.7 or higher
- pip (Python package installer)
- Clone the repository:
git clone https://github.com/Satyamkumarnavneet/Analyticsvidhya-SmartSearch.git
- Navigate to the project directory:
cd Analyticsvidhya-SmartSearch
- Install dependencies:
pip install -r requirements.txt
- Run the application:
streamlit run app.py
- FAISS-based indexing ensures fast search results even for large datasets.
- The system is scalable and can handle a large number of courses efficiently.
- Open your web browser and go to http://localhost:8501.
- Enter your search query in the search bar.
- Customize search parameters if needed.
- View the search results in real-time.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
For support, email [email protected].